Jump to content

  • Log In with Google      Sign In   
  • Create Account


Member Since 03 Apr 2007
Offline Last Active Aug 04 2016 03:48 PM

#5093995 Kalman filter without using matrix inverse

Posted by on 14 September 2013 - 07:47 AM

There are three forms of the Kalman filter.  You might consider using another form,


There's the covariance form, which you are using.  This requires a matrix inversion at each step, but gives you the state estimate directly.  This is the form most people know.


There's also an information form, which works with the inverse of the covariance matrix (called the (Fisher) information matrix).  This does not require a matrix inversion at each step -- but the state estimate is not directly accessible. Instead of a state estimate, you have an information vector.  You can reconstruct the state estimate from the information vector, but this requires a matrix inverse.  One nice thing about this form is that you can specify a totally noninformative prior for the state by setting the initial information matrix to zero.  Also, there are reasonable assumptions under which information matrices are sparse (but covariance matrices are not), so this form can sometimes be efficient for really large problems with sparsity.


Finally, there's the square root form.  Here, instead of working with either the covariance or its inverse, you work with a matrix square root of the covariance matrix (e.g., its Cholesky decomposition), and actually never construct the covariance matrix directly.  This is usually the most computationally efficient filter. [EDIT: Actually, it's the most expensive; what it is is numerically stable.]  It also avoid the problems of ensuring that your covariance or information matrix remain positive definite in the face of roundoff errors, since it's impossible to represent anything but positive definite matrices this way [EDIT: this is largely the point of this form].

#5043428 Ordinary Differential Equation ?

Posted by on 15 March 2013 - 10:26 AM

I also think linear algebra is very important to appreciate some of the things you'll do in course on ODEs.  I'd say the most important concept for this purpose is eigendecomposition (eigenvectors and eigenvalues) and the spectral theorem.  For instance, local equilibria of ODEs are characterized by the eigendecomposition of the Jacobian; and ODEs like the heat and wave equations are themselves solved by eigendecomposing a linear operator (the sines/cosines are its eigenvectors).

#5032457 Ordinary Differential Equation ?

Posted by on 14 February 2013 - 09:31 PM

Linear algebra.

Linear algebra.

Linear algebra.

#5025805 Compressed Sensing for Voxel Data Compression

Posted by on 26 January 2013 - 11:55 AM

Compressed sensing is not meant to be a compression technique: The whole point is using compression techniques to interpolate from sparse data, but you need to start with a compression technique.


I think a lot of people conflate "compressed sensing" with L1 regularization...'


Taking a step back from L1 regularization, and thinking just about a choice of basis: I wonder how far OP'd get with splitting the grid up into blocks and doing a Hadamard transform of each.  It's more-or-less the direct generalization of JPEG to the "3d binary pixels" case...

#5025670 Most Important Mathematics in Advanced AI/Robotics

Posted by on 25 January 2013 - 10:59 PM

Short post:


Convex optimization.


In reality, there are about four kinds of problems in the world that you can solve, and everything consists of reducing other problems to them:

1.) Linear system of equations

2.) Eigenvector/eigenvalue problems

3.) Convex optimization problems (which really includes 1, and mostly also includes 2)

4.) Low-dimensional dynamic programming


I exaggerate slightly, but not by much.



Long post:


To follow up on alvaro's comment about control theory:


There are a couple of core ideas that a lot of different people study, from slightly different angles, and with slightly different tools.  These people include,

- control theoreticians

- roboticists

- classical AI (planning) people,

- machine learning researchers, and

- operations researchers.


The differences between these groups are as much social, cultural, and historical, as they are technical.


In my own view, there are a small number of core problems that they all aim to solve.  The two that really stand out are,


1.) "Markov decision problems" -- a term that I use broadly to include both discrete time and state spaces (what people normally mean when they say "Markov Decision Problem" (MDP)), and with continuous time and state spaces (which people normally call "optimal control").  Additionally, problems here may or may not contain randomness, and each of these variants have very slightly different corresponding theories, but the underlying ideas are essentially the same.  Do you want to find the shortest path through a maze?  Or how to steer a fighter jet to perform a barrel roll?  Or how to move a robot arm around an obstacle?  These are MDP/optimal-control problems.


2.) State estimation problems ("filtering," "smoothing," and trajectory estimation).  Whereas in my last bullet point the goal was to get a system to a particular state, here the goal is to figure out what state a system is in.  Depending again on whether you are interested in discrete- or continuous- time- or state-spaces, depending on what noise model you assume, and depending on which of these problems you want to solve (filtering, smoothing, trajectory estimation) you end up with e.g. the "forward algorithm" for hidden Markov models, the Kalman filter for linear systems, or the Viterbi algorithm, among other closely-related algorithms.


There are generalizations and combinations of these ideas in all sorts of directions.

E.g., an operations researcher might say that Markov decision problems
are "just" optimization problems with a particular structure and go off
to look at Lagrange multipliers and duality gaps.  He might also point
out all the problems that don't most naturally have this structure.  Likewise, some
machine learning people might say that the idea of different "states"
at different "times," again, "just" describes a particular structure of
Bayes net.  You can also combine #1 and #2 to arrive at POMDPs, which
are the most general (single-player) discrete--time-and-state problem.  But despite all this, I think the two problems I listed above capture the essence of most things.


As for "statistics:" I do not think that there is a single unified intellectual edifice with this name.  The only thing that makes any sense to me philosophically is personalist Bayesian statistics, and even that I'm still figuring out.

#5001108 Group circle collision

Posted by on 14 November 2012 - 10:54 PM

The usual Google terms here are going to be things like "flocking," and "potential functions."

There's a ton written on this topic, and there are many variations. Here's a nice, simple approach:

0.) You have a bunch of units -- aka "agents" -- and each has a position vector in 2d. If you have n agents, call their positions x1, x2, ..., xn.

1.) Define an "interagent potential function." This is a scalar-valued function of two agents' positions that's big when the agents are close and small when far away. You typically pick something that looks vaguely bell-curve-ish. Call it w. I'll give an example later.

2.) You add up these potentials for all pairs of agents, to define a new function
V(x1, x2, ..., xn) = sum_over_all_ij w(xi, xj)
It's something a lot like "potential energy."

3.) You compute a partial derivative dV/dxi, for each i. This is something a lot like "force."

4.) At each step, you push each agent a little bit against the derivative, i.e.
xi[t+1] = xi[t] - eps*dV/dxi(x1,...,xn)
where 'eps' is a small number.

If you pick a smooth enough w, then for sufficiently small "eps," this will converge (if you want to be really fancy, you check to make sure that V has gone down since the last iteration, and, if not, halve "eps" and try again (and so on)).

What you're doing is a lot like a "physics simulation" of "charged particles." The differences are that (1) you're working directly with velocities, not accelerations, and (2) you're not necessarily using 1/r^2 force fields (which give all kinds of nasty divide-by-zero--type problems when agents get too close to each other).

The potential function that I personally like to use I've heard called the "poly6 kernel." As a function of the squared radius between two particles, it is,

w(r^2) = ( r^2 - 1 )^2 if r < 1 and 0 if r >= 1.

I like it because it looks a lot like a Gaussian, because it is exactly zero outside r=1 (so you only need to look at neighbors within that radius), and because you only need r^2 and not r, so you can avoid a square root. But this is just one choice of function; the real key is not what potential you pick, but simply that you define your algorithm in terms of a potential. If you do, you get, "for free," good stability properties.

#5000774 Drawing images along a line

Posted by on 13 November 2012 - 09:38 PM

I have a suspicion that anything reasonable that you do will be plenty fast enough.

The standard/obvious thing to do would be to convert your line segments to quads ahead of time, store them in some kind of spatial data structure (e.g., quad- or KD- tree, regular grid, or just arrays sorted by x and y position), do a conservative cull against your view box, and then and draw what's left with hardware texture mapping if available.

The line-segment--to-quad math, by the way, is pretty cheap. If you have a line from p1 to p2, then the tangent to the line is T=normalize(p2-p1), and the normal is N=J T, where J is the 90-degree rotation matrix (it exchanges the x and y coordinates and flips the sign of one). Then, if the road width is w, and if we define an "offset vector" by d = 0.5 w N, then you can just write the vertices as p1 + d, p1 - d, p2 + d, and p2 - d.

Back to spatial data structures: The only other thing that springs to mind is to exploit temporal coherence and the fact that your world is a graph, by storing neighbor/linkage information in the line segments, maintaining a list of the edges currently intersecting the edge of the view box, and, each frame, only doing intersection tests with those edges and their neighbors, to determine which of these line segments are entering or leaving the view. The assumption this would make is that you can never move more than the length of an edge in a frame. This is probably the fastest method, but also the least flexible, and, since you could probably get away with brute-forcing this, I can't really recommend it, even if it is the most amusing.

#4995290 Production graph search

Posted by on 29 October 2012 - 10:22 PM

This is an interesting question. I need to get some sleep so I'll keep this short (apologies if it's a bit telegraphic), but I hope the following is helpful:

This sounds very similar to max-flow (but not quite identical).

If you have n conversion processes and T time steps, picture a graph with n T nodes. Each node represents doing a particular process at a particular time. Then, you link all the nodes from one time to all the nodes at the next time. I'm picturing a vertical column of nodes for each time, and those columns arranged in order from early times to late times from left to right.

What you're going to do is optimize over flows on the edges.

What are our constraints? Let's look at an individual node. Say you have a process with two inputs and two outputs. The conversion it does is,
INPUT: 2 units of input resource 1 and 3 units of input resource 2
OUTPUT: 4 units of output resource 1 and 7 units of output resource 2.

Say the amounts of input resources you put in are a and b, and the amounts of output resources are x and y. Then, introducing a new variable z to represent the total number of "trades" done, the above is summarized by the constraints,

z <= a/2
z <= b/3
x <= 4 z
y <= 7 z

so you have inequality constraints relating the flows on your edges. Additionally, all the flows need to be nonnegative, meaning that they go the right direction in time:

x >= 0
y >= 0
a >= 0
b >= 0 .

You also have some equality constraints: The flows into the nodes at the far left of the graph (the "earliest" ones) are fixed; these are the quantities you start out with. And the flows at the far right of the graph have "greater-than" constraints, to say that you want produced at least this much of each resource.

Finally, you have a cost, which is just a weighted sum of the flows.

So what does this boil down to?
- You have as many variables as you have edges [EDIT: plus nodes; you also have the "z" variables]
- You have a linear cost (a weighted sum)
- You have linear inequality constraints

You can solve this exactly as an integer linear program, or you can relax the answer to be real (instead of just integer), in which case you just have a standard LP.

Does this formulation leave anything out?

#4990079 minimax with alpha beta pruning implementation stuck.

Posted by on 14 October 2012 - 10:56 AM

I have implementations of multiplayer minimax and there certainly are papers out there going into detail of the issues. The easiest algoriths are paranoid search which basically treats all opponents essentially as one (they are all out to get you) which is the most pessimistic view as apposed to maxn which is the most optmistic method (everyone just tried to max their own score).

This is interesting. "Paranoid search" hadn't occurred to me.

The "paranoid" concept is cool for a couple reasons.

The first is that it turns an N-player game into N separate two-player games. And if your game is zero-sum across all the players, then each of these two-player games will also be zero-sum, and so easy to solve (in mixed strategies, when turns are simultaneous).

The second is that, with this solution concept, we can understand "how we got there." With Nash equilibrium, it's unclear how you actually arrive at an equilibrium, or which equilibrium you arrive it. You've got things like Condorcet dynamics and "fictitious play," but these are kind of unsatisfying; there's no great reason to expect players to use these mental processes.

One question to ask about "paranoid" solutions is how "consistent" they are. Are the solutions you get even Nash equilibria? I expect not. Whereas I think(?) that maxn solutions are Nash equilibria. (But if not -- do we care? Why do we want Nash equilibria in the first place?)

#4989530 [matlab] vector-valued functions...

Posted by on 12 October 2012 - 11:34 AM

Oh! I think I understand why you're doing what you're doing. You've been getting the error message,

??? Error using ==> lsqfcnchk at 117
FUN must be a function or an inline object;
or, FUN may be a cell array that contains these type of objects.

so you've been thinking FUN can be a cell array. Then you type "help lsqnonlin," and you read

X = LSQNONLIN(FUN,X0) starts at the matrix X0 and finds a minimum X to
the sum of squares of the functions in FUN.

and that phrase "the functions in FUN" leads you to believe that there are multiple functions.

All this is extremely misleading. FUN is one function. It returns a vector. Why the MATLAB documentation confuses the issue, I do not know.

Here's a correct version of what I think you were attempting with your last code:
% answer.m
function answer()
	[x, resnorm] = lsqnonlin(@residuals, 10);
	fprintf('x = %g\nresnorm = %g\n', x, resnorm);
function r = residuals(x)
	for i=1:3
		r(i) = x^i;

Note that the answer is trivial. We're looking for the number "x" that minimizes (x^1)^2 + (x^2)^2 + (x^3)^3, and that number is of course zero. Note also that "residuals" can be written more efficiently than above as,
function r = residuals(x)
	r = x.^[1:3];
and that, if you really want, you can do everything with the one-liner,
[x, resnorm] = lsqnonlin(@(x)(x.^[1:3]), 10);

Finally, I think you may also be a bit confused about cell arrays. The syntax, "opt = {3};" does not declare a cell array of size 3x1 or 1x3. It creates a cell array of size 1x1, containing the value 3. For an example of what the curly brackets do in this context, consider the statement "opt = {[1;2], eye(2), 1};" this makes a 1x3 cell array containing a 2x1 vector, a 2x2 matrix, and a scalar.

I hope that helps.

#4988537 Rendering Magnetic Fields

Posted by on 09 October 2012 - 06:05 PM

Is your game 2d or 3d? And are you familiar with vector notation?

For the magnetic field of a dipole in 3d, check out the Wikipedia article here. I've linked to a particular section; see the equation for B(x) there. This gives you your magnetic field.

There are a few ways to draw magnetic fields; we can help you with them. Some options are,

Draw a vector field (2d/3d): For every point in a grid, draw an arrow with the right direction and magnitude)
It will look something like this:
Posted Image

Draw field lines (2d/3d):
Method 1 (using the vector field itself): From a collection of starting points (usually also in a grid), walk forwards and backwards along the vector field, drawing a curve.
Method 2 (in 2d, using the vector potential): Use marching squares/triangles to draw level sets of the vector potential.
It will look something like this:
Posted Image

Plot the magnetic potential (2d):
In 2d, the magnetic potential is a scalar field. You can either map these values to grayscale, or to a color scale, and then just draw this image.
It will look something like this:
Posted Image

#4985556 [Answered] Good math library for OpenGL

Posted by on 30 September 2012 - 05:56 PM

I absolutely love Eigen. That said, it's more of a general-purpose linear algebra library than a "graphics math" library. Still, the syntax is hard to beat. Beautiful template magic.

#4984712 Advanced pathfinding with robot in 3D

Posted by on 28 September 2012 - 06:57 AM

I'm with snowman. As a starting point, I always direct people to Steve LaValle's book. Another very good book is this one.

The state of practice for robot planning is essentially this:

1. Use an RRTto generate some path -- any path -- in the configuration space (i.e., space of joint angles) from start to goal. The resulting path will not be optimal in any sense, but it can be computed quickly and it gets from point A to point B, which is the most important thing.

2. Smooth the path. There are varying strategies for this, and there is a little less consensus about the details, but the gist is that you do some combination of throwing out intermediate points in the path, and running splines (usually cubic) through the remaining points. In lower dimensions, this is what algorithms like "string pulling" aim to solve.

3. Write a feedback controller to follow the path. The standard methods (which are very similar) are "inverse Jacobian" and "Jacobian Transpose" controllers (the first reasonable Google hit is here; I also particularly like this book.)

There are some variations for the various items above, especially for part 1: Along similar lines to RRTs, there are also algorithms like RRG* that attempt to grow a random graph, and attempt to converge to a path that is actually optimal. This paper by Frazzoli's student should point you in the right direction (see the references). There are also some deterministic planners that people have had some success with; these do a combination of graph search and graph refinement (in the style of hierarchical A*).

#4917393 Cybernetics and games?

Posted by on 28 February 2012 - 08:52 AM

A lot of academic words are more about group identity and connotation than hard meaning and denotation.

I think "Cybernetics" is one of them.

"Cybernetics" was coined by Norbert Weiner back in the day, and I think that the intellectual descendents of Norbert and friends have mostly abandoned the term. I think the word has been mostly replaced by "Mathematical Systems Theory" and "Control theory." One of the reasons the phrase "Cybernetics" has fallen out of favor is that it was overhyped; it fell prey to buzzword inflation. You had snake-oil "consultants" telling businesspeople they could "use cybernetics" to analyze their businesses and create efficiencies -- at which point the word lost much of its value. That variety of "cybernetics" probably looked to controls people a lot like how "Six Sigma" (a mathematical term turned business buzzword) must appear to statisticians today.

Most of the real "meat" attached to the word "Cybernetics" is work studying systems of Ordinary Differential Equations from a graph/network perspective. That certainly has value, and I think every engineer should have an appreciation for feedback.

#4911859 Colliding vs an implicit distance field

Posted by on 10 February 2012 - 08:36 PM

However if you use it to calculate the distance to a surface from an arbitrary point, it will fail in many cases -- basically any time the point is inside any of the volumes.

I thought most of the formulas on the page you referenced were for signed distance -- so, they're positive when you're outside, negative when you're inside, and the absolute value is the distance to the surface.

I don't really understand what you meant by "(You would not have this luxury if your surfaces were the zero-level sets of some other functions besides distance fields.)"

My point was just that the sphere is colliding with the zero-level set if and only if the center is a distance 'r' from the level set. Since "How far away am I?" is exactly what a distance field returns, you just need to check whether it evaluates to a number less than r. But what if your function isn't a distance field? You can imagine other functions that have the right zero-level sets, but that DON'T tell you the distance to the set when you're at other points; (when you're not on the zero level set, it can return any other number, so long as it's nonzero).

Let's consider a simple 1d example. Say our set is the interval [-1,1]. The signed distance to this set is just

d(x) = abs(x)-1 .

Another function that also evaluates to zero at the same points is,

f(x) = x^2 - 1

but the number it returns isn't the distance to the set. E.g., the point 2 is d(2)=1 away from the set, not f(2)=3 away from the set.

Does this clear things up, or am I not answering your question?