datahead8888

Using the product rule for a partial derivative of a vector/matrix function...

7 posts in this topic

I was working on a physics animation, and I need to compute the Jacobian in order to solve a partial differential equation.

 

Suppose we have a function consisting of a series of matrices multiplied by a vector:
f(X) = A * B * b
--where X is a vector containing elements that appear within A, B, and/or b,
--A and B are matrices, and b is a vector

Each matrix and the vector are expressed in terms of more variables, ie...
X = (x1, x2, x3)

A =
[ x1 + y1        y4       y7 ]
[      y2   x2 + y5       y8 ]
[      y3        y6  x3 + y9 ]

B =
[      y1   x2 + y4  x3 + y7 ]
[x1 +  y2        y5       y8 ]
[      y3        y6       y9 ]

b = [y1 y2 y3]'  (' means transposed)

Now we want to find the Jacobian of f - ie the partial derivative of f wrt X.

One way to do this is to multiply the two matrices and then multiply that by the vector, creating one 3x1 vector in which each element is an algebraic expression resulting from the matrix multiplication.  The partial derivative could then be computed per element to form a 3x3 Jacobian.  This would be feasible in the above example, but the one I'm working on is a lot more complicated (and so I would also have to look for patterns in order to simplify it afterwards).

I wanted to try to use the chain rule and/or the product rule for partial derivatives if possible.  However, with the product rule you end up with A' * B * b + A * B' * b + A * B * b', where each derivative is wrt the vector X.  I understand that the derivative of a matrix wrt a vector is actually a 3rd-order tensor, which is not easy to deal with.  Even if that is not correct, the other terms still have to evaluate to matrices in order for matrix addition to be valid.  If I use the chain rule instead, I still end up with the derivative of a matrix wrt a vector.

Is there an easier way to break down a matrix calculus problem like this?  I've scoured the web and cannot seem to find a good direction.
 


To be clear - the Jacobian I find will be implemented in C++.

I'm not creating a system to find Jacobians arbitrarily.


You can make a type that knows the value of a function and its derivatives with respect to some variables, and overload operators and functions so the correct rules are applied; then simply evaluating an expression using this type will give you the Jacobian. Would something like that work for you?

 

Here's some code that does it for a single variable:

#include <iostream>
#include <cmath>

struct Jet {
  double value;       // f(x)
  double derivative;  // f'(x)
 
  Jet(double value, double derivative = 0.0)
    : value(value),
      derivative(derivative) {
  }
};

Jet operator+(Jet j1, Jet j2) {
  return Jet(j1.value + j2.value, j1.derivative + j2.derivative);
}

Jet operator-(Jet j1, Jet j2) {
  return Jet(j1.value - j2.value, j1.derivative - j2.derivative);
}

Jet operator*(Jet j1, Jet j2) {
  // Product rule: (f*g)' = f*g' + g*f'
  return Jet(j1.value * j2.value, j1.value * j2.derivative + j2.value * j1.derivative);
}

Jet operator/(Jet j1, Jet j2) {
  return Jet(j1.value / j2.value, (j1.derivative * j2.value - j1.value * j2.derivative) / (j2.value * j2.value));
}

// More operators here...


Jet exp(Jet j) {
  return Jet(std::exp(j.value), std::exp(j.value) * j.derivative);
}

Jet log(Jet j) {
  return Jet(std::log(j.value), j.derivative / j.value);
}

Jet pow(Jet j1, Jet j2) {
  return exp(j2 * log(j1));
}

Jet sin(Jet j) {
  return Jet(std::sin(j.value), std::cos(j.value) * j.derivative);
}

Jet cos(Jet j) {
  return Jet(std::cos(j.value), -std::sin(j.value) * j.derivative);
}

// More functions here...

template <typename R>
R my_function(R x) {
  return x * sin(x) - pow(x, cos(x));
}

int main() {
  Jet x(2.0, 1.0);  // evaluate at x = 2; seed derivative dx/dx = 1
 
  Jet fx = my_function(x);
 
  // Prints f(2) and f'(2).
  std::cout << fx.value << ' ' << fx.derivative << '\n';
}
 

This probably would work if I just needed the derivative of something to be applied during a simple physics implementation, but I need to arbitrarily find the Jacobian matrix so that it can be fed into a conjugate gradient implementation that solves a system of equations.  What I'm doing is a bit theoretical, but it would still need to be implementable.

 

Writing a program like this to find it is an interesting idea - I wonder if I could construct some sort of list or other data structure that keeps track of my terms as they're found?


I've worked a lot with matrix calculus and tensors (I'm a PhD student and use them in my research all the time). In this post I'm going to spend a lot of time explaining the basic ideas/techniques in the way I think about them, and then at the end apply them to your problem. Eventually we get a formula for the derivative object you want, K.

 

 

---- Background ----

 

If you want to generally understand matrix calculus, it's much easier if you get comfortable with tensors. Tensors are not so bad. The most important thing is understanding how to unfold them into matrices or vectors, do something to them, and then fold them back up into tensors. If you become comfortable with that, then everything becomes straightforward. The first few sections of this paper are really good, with pictures:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.2059&rep=rep1&type=pdf

 

If you have access to matlab and you want to get an intuitive feel for tensors, then I'd highly suggest making up tensors/higher dimensional arrays and playing around with the commands "reshape" and "permute" to get a feel for how it works.

 

 

---- Meaning and application of the third order derivative tensor ----

 

Ok, now back to the original question. Let's think about an N-by-N matrix, A, parameterized by k variables, A(x_1,...,x_k). If you take the derivative, you get a k-by-N-by-N box of numbers A_prime, where

 - the first slice A_prime(1,all,all) is the derivative of A with respect to x_1,

 - the second slice A_prime(2,all,all) is the derivative of A with respect to x_2,

 - and so forth.

 

The first order of business is to understand how this third-order tensor is applied to a vector to give the directional derivative in any given direction, v. The meaning of this directional derivative is just what you expect - if you evaluate the matrix at the point X and at a nearby point X + epsilon*v, how much does the matrix change? Ie,

 

[derivative of A in the direction v] = lim_s->0 (A(x_1 + s*v_1, x_2 + s*v_2, ..., x_k + s*v_k) - A(x_1, x_2, ..., x_k))/s

 

To apply the third-order derivative tensor to a vector, you unfold A_prime into a k-by-N^2 matrix so as to expose the first mode on the left, then left-multiply by the vector you are applying, and then fold the result back up into a tensor of one lower order (an N-by-N matrix), like so:

 

[derivative of A in the direction v]  =  reshape(v^T * reshape(A_prime, k, N*N), N, N)^T
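In code, this unfold/multiply/fold collapses to a single contraction over the first mode. A minimal C++ sketch (row-major flat storage rather than matlab's column-major reshape; the function name and layout are my own):

```cpp
#include <vector>

// A_prime holds the k-by-N-by-N derivative tensor flat, row-major:
// A_prime[p*N*N + i*N + j] = d A(i,j) / d x_p.
// Returns the N-by-N directional derivative sum_p v[p] * A_prime(p,:,:),
// which is exactly what the reshape recipe above computes.
std::vector<double> directionalDerivative(const std::vector<double>& A_prime,
                                          const std::vector<double>& v,
                                          int k, int N) {
    std::vector<double> dA(N * N, 0.0);
    for (int p = 0; p < k; ++p)
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                dA[i * N + j] += v[p] * A_prime[p * N * N + i * N + j];
    return dA;
}
```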

 

 

---- Including right matrix multiplication ----

 

The next order of business is to understand what to do when there is another matrix B multiplying on the right,

 

[derivative of A in the direction v]*B

 

Obviously if we had 'v', we could apply the reshaping procedure to find the derivative, and then multiply by B. However, we want to think of v as being a free variable, and find the object that describes the whole process at once. In other words, what is the 3rd-order tensor, K, such that, when you do the standard reshaping and multiplying process, you get the desired result?

 

[derivative of A in the direction v]*B  =  reshape(v^T * reshape(K, k, N*N), N, N)^T //what is K though?

 

To get this K, we unfold A_prime to expose the rightmost mode, then right-multiply by B, then fold it back up:

 

K = reshape(reshape(A_prime, k*N, N)*B, k, N, N)

 

 

---- Including left matrix multiplication ----

 

Now we know how to deal with matrices multiplying on the right, but what about if there is a matrix multiplying on the left instead? Ie:

 

B*[derivative of A in the direction v]

 

To deal with this, you have to permute the dimensions of the tensor to gain access to the left matrix mode, then unfold it, then left-multiply by B. After that, you fold it back up and undo the permutation. The necessary permutation to move the middle mode to the left is [2,1,3], since that takes (k,N,N) -> (N,k,N). The inverse permutation is also [2,1,3], to do (N,k,N) -> (k,N,N). In detail, the overall tensor for the derivative and left-multiplying process is,

 

K = permute(reshape(B*reshape(permute(A_prime, [2,1,3]), N, k*N), N, k, N), [2,1,3])

 

 

---- Including matrix sandwich ----

 

Ok, finally, what about the combination of both - when the derivative is sandwiched in between matrices like so:

 

B*[derivative of A in the direction v]*C

 

Here you just do both of the things we did before - unfold to expose the right mode, right-multiply by C, fold back up, permute and unfold to expose the middle mode on the left, left-multiply by B, fold back up, and unpermute.

 

K = permute(reshape(B*reshape(permute(reshape(reshape(A_prime, k*N, N)*C, k, N, N), [2,1,3]), N, k*N), N, k, N), [2,1,3])

 

 

---- Application of the techniques to your problem ----

 

This covers all the cases necessary to deal with this sort of matrix equation. In our case, we seek to find a 3rd-order k-by-N-by-1 tensor, K, such that

 

reshape(v^T * reshape(K, k, N*1), 1, N)^T = ...

[derivative of A in direction v]*B*b + ...
A*[derivative of B in direction v]*b + ...
A*B*[derivative of b in direction v]

 

Using the above techniques, the tensor is K = K1 + K2 + K3, where

 

 

K1 = reshape(reshape(A_prime, k*N, N)*B*b, k, N, 1)

K2 = permute(reshape(A*reshape(permute(reshape(reshape(B_prime, k*N, N)*b, k, N, 1), [2,1,3]), N, k*1), N, k, 1), [2,1,3])

K3 = permute(reshape(A*B*reshape(permute(b_prime, [2,1,3]), N, k*1), N, k, 1), [2,1,3])

 

Since b is a vector not a matrix, some of the stuff above is extraneous. Basically we can remove singleton dimensions like (k,N,1) -> (k,N), and replace permute [2,1] with a matrix transpose. These simplifications lead to the following form.

 

K1 = reshape(reshape(A_prime, k*N, N)*B*b, k, N)

K2 = (A*reshape(reshape(B_prime, k*N, N)*b, k, N)^T)^T

K3 = (A*B*b_prime^T)^T

 

and application via,

 

[derivative of f in direction v] = (v^T * K)^T
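For implementation, these simplified contractions are plain loops. A sketch of K1 in C++ under the same flat row-major conventions (the naming is mine; Bb is the precomputed vector B*b):

```cpp
#include <vector>

// A_prime flat, row-major, shape (k, N, N):
// A_prime[p*N*N + i*N + j] = d A(i,j) / d x_p.
// Bb is the N-vector B*b. Returns K1 with shape (k, N), where
// K1(p, i) = sum_j A_prime(p, i, j) * Bb(j),
// the row-major analogue of reshape(reshape(A_prime, k*N, N)*B*b, k, N).
std::vector<double> contractK1(const std::vector<double>& A_prime,
                               const std::vector<double>& Bb,
                               int k, int N) {
    std::vector<double> K1(k * N, 0.0);
    for (int p = 0; p < k; ++p)
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                K1[p * N + i] += A_prime[p * N * N + i * N + j] * Bb[j];
    return K1;
}
```

K2 follows the same pattern applied to B_prime and b, plus the final multiplication by A and the two transposes; K3 is an ordinary matrix product.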

 

---- Conclusion ----

 

 

Hope this helps. It's a lot to absorb, but it eventually makes sense. It took me a long time to understand this stuff, but I think the learning process could be greatly expedited if you play around with reshaping and permuting multidimensional arrays. Other than the paper linked at the start, I haven't found many good resources to learn from.

Edited by Nick Alger


 

The extension of the technique I showed you to computing a Jacobian matrix is fairly straightforward. If you can't see it, I can try to write another piece of sample code.

