Solving Systems of Equations

Started by
15 comments, last by TheAdmiral 17 years, 7 months ago
Okay guys. Seems I am having another bit of trouble reasoning this one out. I have 100 different data points. Each one has 20 parameters and 1 result. I am trying to find the best correlation between these parameters and the results. It is possible that some of these parameters have no influence on the result.

I think I can best express the equations as (P1 is parameter 1, P2 is parameter 2, etc.):

a*P1^[a1] + b*P2^[b1] + ... + t*P20^[t1] = N

Example: a*15^[a1] + b*5^[b1] + ... + t*16^[t1] = 142.7

Now, I can fill in P1 through P20 for all data points, and I know N for each equation. Is it possible for me to solve for all the other variables? There seem to be 40 independent variables, and I have well over 40 data points (100, to be exact). That should be more than enough to solve it, right? Can I write this programmatically (probably using a language like SML)?

I am a bit rusty on my math. Initially I thought the problem didn't have the ^[a1] through ^[t1] exponents, and I was just going to use a substitution method to solve for a through t. But now, with the ^[a1] through ^[t1], I think I am out of my league. Any ideas?

Thanks
Corey
Hmm, you have 1 equation and 40 unknowns, so I'm pretty sure you need 39 other equations to be able to solve it.
I'm having trouble understanding what all of your variables mean. So P1, P2, etc. are the parameters for a given data point, correct? What do a-t and a1-t1 represent? If you can somehow simplify these equations into first-degree polynomials, you can solve them using a matrix.
The other equations would come from filling in the data from the 100 other data points I have.

Must they be distinctly different equations?
IE:

5a+3b+7c = 10
10a+7b+10c = 18
6a+14b+3c = 13

Theoretically, those are solvable. The datapoints were (5,3,7,10), (10,7,10,18) and (6,14,3,13).
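As a quick sanity check (illustrative Python, not part of the original discussion), those three linear equations do have a unique solution:

```python
import numpy as np

# The three sample equations above, written as a matrix system A x = b,
# where x = (a, b, c).
A = np.array([[5.0, 3.0, 7.0],
              [10.0, 7.0, 10.0],
              [6.0, 14.0, 3.0]])
b = np.array([10.0, 18.0, 13.0])

# det(A) is nonzero here, so the solution exists and is unique.
x = np.linalg.solve(A, b)
print(x)
```

Substituting the solution back into A x reproduces b, confirming solvability.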

Is that any different than:

a*5^[a1] + b*3^[b1] + c*7^[c1] = 10

except I would use six equations now.

Again, I am rusty with my math. But is it solvable?

I mean, take a simple example:

a*5^[a1] = 2
a*7^[a1] = 6

In this case, the data points are (5,2) and (7,6). The equations have the same form.

a = 2/(5^[a1])

(2/(5^[a1]))*(7^[a1]) = 6
7^[a1] / 5^[a1] = 3
7/5 = 3^(1/a1)
ln(7/5) = (1/a1)*ln 3
a1 = ln 3 / ln(7/5) = 3.265

Then, I go back and solve for a (it ends up being .01044)...
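A quick Python check (illustrative only) of the arithmetic above:

```python
import math

# Checking the two-equation example: a*5^a1 = 2 and a*7^a1 = 6.
a1 = math.log(3) / math.log(7 / 5)   # ~ 3.265
a = 2 / 5 ** a1                      # ~ 0.01044

# Both original equations should now be satisfied.
print(a * 5 ** a1, a * 7 ** a1)  # ~ 2.0 and 6.0
```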

Can I do the same if I have 40 unknowns with 40 equations instead of 2 unknowns and 2 equations? I should be able to, right?

a,a1 through t,t1 are simply 'unknown' values. I just have a ton of data points and I believe they are somehow correlated. I don't know how, but I want to see if I can crunch them out.

Now, my biggest issue is this: I have no idea what the shape of the equation may be. All the points should have been crunched with the same equation, but I have no idea what that equation is. For all I know, the equation could be 1/a = N, where a1 through t1 are never actually used. However, since I do not know the shape of the equation, I was hoping I could create a sort of "best fit" equation using this method.

I think I could write up some code that could solve it for me ... but I would have to hear from you guys that this is actually solvable first and that I am not wasting my time completely.

Is there maybe a way I can create a 'best fit' curve more efficiently, even if I don't know what curve to apply?

Is there any way I can get a computer to crunch this for me? Am I off my rocker?
If you were dealing with a simple system of equations, such as...
Quote:5a+3b+7c = 10
10a+7b+10c = 18
6a+14b+3c = 13

... then I would say yes. But you're not; you have variables in the exponents, and that complicates things. My instinct says that it is still solvable, but not easily. The two-variable case is simple, but using four variables complicates things greatly, since the bases of each term are probably different. I would recommend resorting to an approximation technique. Unfortunately, I don't know where you should start looking. :\
Just trying to find out what you're doing:

You say you have 100 or so data points which you know are the result of an equation of the form a_0*p_0 + a_1*p_1 + ... + a_n*p_n = N, where n is the number of parameters and a_0, ..., a_n and N are known constants. You need to find the p_0, ..., p_n that fulfill the equation for each data point.

Is that what you're trying to do?
Close.

a_0 * p_0 ^ a_1 + ... + t_0 * p_n ^ t_1 = N

Except I don't know if that was the equation used. I am just trying to "force values" into a_0 -> t_1 to find a "close enough" fit.

Well, another way to approach this may be better. (I'm just thinking here, since I haven't done anything like this in a while, and definitely nothing this big.)

If I understand correctly you have a black box that takes 20 inputs (P1-P20) and generates 1 output (N). You've done this 100 times so you have 100 sets of inputs and the resulting output for each set.

To fit data to a curve (1 input and 1 output), you can use a polynomial: y = a + b*x + c*x^2 + ... You can increase the order to get a better fit (e.g. add d*x^3, etc.). You then need to solve for a, b, c, etc.

For a MISO system (many inputs, single output), I guess you could have something like this:

y = a1 + b1*x1 + c1*x1^2 + ... + a2 + b2*x2 + c2*x2^2 + ... + a3 + b3*x3 + c3*x3^2 + ...

So for your case,
N = a1 + b1*P1 + c1*P1^2 + ... + a2 + b2*P2 + c2*P2^2 + ... + a3 + b3*P3 + c3*P3^2 + ... + a20 + b20*P20 + c20*P20^2 + ...


So the number of unknowns to solve for depends on the complexity of the equation you want.
You'll have 20 unknowns times the (order + 1). So if you choose second-order equations, you'll have 60 unknowns (which should be fine since you have 100 data points).

Now to solve it, I don't know. This seems simpler than the form you originally suggested, and I think this is more of the standard approach. So you might be able to search some on curve fitting for MISO systems.
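To make that concrete, here is a sketch of fitting such a second-order MISO model with ordinary least squares. Everything here is invented for illustration (NumPy, a fake 20-input black box). One caveat: the separate intercepts a1...a20 collapse into a single constant term, so the second-order model really has 41 unknowns rather than 60.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the real data: 100 points, 20 inputs each.
P = rng.uniform(-1, 1, size=(100, 20))
# Pretend black box (unknown in practice): N depends on P1 and P3 only.
N = 2.0 * P[:, 0] + 0.5 * P[:, 2] ** 2 + 1.0

# Design matrix for the second-order MISO model:
# one intercept column, then P_j and P_j^2 for each input (41 columns).
X = np.hstack([np.ones((100, 1)), P, P ** 2])

# Overdetermined (100 equations, 41 unknowns): least squares finds the
# coefficients minimising the squared error.
coeffs, *_ = np.linalg.lstsq(X, N, rcond=None)
residual = np.max(np.abs(X @ coeffs - N))
```

Since the toy target lies exactly in the model's span, the residual comes out at machine precision; real data would leave a larger residual that measures goodness of fit.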

Hmm, a system can also be represented by y = Ax + B:

[y1 ]   [a1  b1  c1  ... t1 ][x1 ]   [u1 ]
|y2 |   |a2  b2  c2  ... t2 ||x2 |   |u2 |
|y3 | = |a3  b3  c3  ... t3 ||x3 | + |u3 |
|...|   | ...               ||...|   |...|
[y20]   [a20 b20 c20 ... t20][x20]   [u20]

Then maybe solved using control theory stuff, but I'd have to look into that more too to see if that would work.

Got sidetracked at work - so I'll post this and maybe look more later if I can. G'luck

[Edited by - reana1 on September 12, 2006 1:58:35 PM]
Tadd- WarbleWare
So you have some unknown equation that has given you 100 points, and each point involves 41 variables(?) [20 "[letter]" coefficients, 20 "[letter]1" exponents, and 1 N]?

And you're trying to figure out if you knew the equation, would it be solvable?

And if it is, is there some way of getting a rough estimate of that equation?

EDIT:^^...what he said^^ :-)
if(this.post == SATISFYING){money.send(1.00,&HemoGloben);return thanks;}
No offence, visage, but your method stinks [wink].

You're throwing twenty unknowns (was 20 chosen at random?) into exponent positions across 100 highly nonlinear equations in god knows how many unknowns. You are guaranteed neither existence nor uniqueness (so an analytical solution went out the window before you started), and your processor would have forty fits. Don't fret though, we'll start again:

First, a little linear algebra:
Quote:Rank-Nullity Theorem:
Given m linearly-independent linear equations in n variables;
at least one solution exists if m <= n,
at most one solution exists if m >= n.

(linearly independent meaning that you can't make any of the equations by adding together constant multiples of the others)

This is where people get their '3 equations to solve for 3 unknowns' rule from. But always remember the rules: The equations must be linearly independent (else you could simply multiply one of your equations by 2 and have a 'new' one) and the equations must be linear. That means no ², ³, sqrt, ^, exp, sin, cos or anything else fun.

So if you want a reliable solution, you'll need exactly as many linearly independent linear equations as you have unknowns. Since you know nothing of the nature of the curve you are fitting, this is a tall order. Assuming first that the problem can be reduced to finitely many linear equations, you'd not only have to determine what that number is but also how to derive the problem set. So that pretty much rules out linear algebra. Now, barring some very high-powered group theory and number theory, any hope of devising an analytic solution (i.e. determining the answer exactly by solving rational equations) is completely gone, unless you are presented with new evidence. So let's put that behind us and never look back.

What's left then? Numerical methods. I don't much like them (and most other 'real' mathematicians despise them), but considering that your solution will be tainted by floating point error when it is completed anyway, why get strung up over arbitrary accuracy?

From what I gather, you have 100 nodes and you want a nice smooth path to fit through (or very close by to) all of them. Can you perhaps tell us where the nodes came from? Are you sure that they were generated by formula or could they have been sampled from something less predictable, like the stock market? It doesn't make too much difference, but you may want to take slightly different approaches each way.

Now, there are 'cheating' methods, where you contrive a function that fits through the points, but these all have their problems.

Perhaps the easiest way to produce a function to pass through N nodes is Lagrangian interpolation. The catch is that it is only really guaranteed to do just that; anything else is a huge bonus. The idea is to create a polynomial of order N and use its degrees of freedom to prise the curve through the points, one by one. As much as it gets the job done, you'll find that as N increases, the polynomial gets really quite wild: the result is a high-frequency, high-amplitude wave that just happens to pass through all of your points around its equilibrium level. After 100 degrees of jimmying, I can imagine that the values in between your nodes will deviate by incredible distances (we're talking several orders of magnitude), which probably isn't what you want.
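For reference, Lagrangian interpolation is short to sketch. This is a generic illustration (the function name and data are made up), not code from the thread:

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial: 1 at xs[i], 0 at every other node.
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# The interpolant reproduces its own nodes exactly.
xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 2.0, 5.0]
print([lagrange_eval(xs, ys, x) for x in xs])
```

With many nodes, evaluating between them exhibits exactly the wild oscillation described above (Runge's phenomenon).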

So alternatively, you could use cubic B-splines to produce a beautiful curve that passes through each node with effortless grace, doing a magnificent job of interpolating as it goes. Only, the equation for that is defined piecewise, so you can't really write it 'on paper' very easily, and any attempts to extrapolate values out of your known range will quickly go pear-shaped. This is one to look out for though, if you don't necessarily need to recover the original function.
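A minimal sketch of that spline approach, assuming SciPy is available (the data here is invented for illustration):

```python
import numpy as np
from scipy.interpolate import CubicSpline

# 100 nodes sampled from some smooth but unknown function.
x = np.linspace(0, 10, 100)
y = np.sin(x) + 0.1 * x

# Piecewise cubic that passes through every node with a continuous
# second derivative; good for interpolation, risky for extrapolation.
spline = CubicSpline(x, y)
print(spline(5.05))  # smooth interpolated value between nodes
```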

So barring such reconstructive methods, we're left with some elementary problems in representation theory:

If you ever studied complex analysis (or perhaps even real analysis) then you'll know that any holomorphic function (which is just about all the interesting ones, certainly the one you're after) can be represented perfectly as a power-series. Anything that isn't holomorphic will need a similar construct - the Laurent expansion - but that's a story for another day.

Another note to make is that the higher order terms of a nice-looking power series tend to be fairly insignificant, so a truncated power series (i.e. a big polynomial) will do the trick in this case.

The method I'm about to outline (Least Squares Approximation) isn't dissimilar at all to what you were trying to do, but a few key concepts will be involved to turn the method from an analytical one into a numerical one.
The maths may look long and scary at first, but if you're feeling confident, you'll be a much better man for knowing how to work LSA:

The rough idea goes:

1. Write the approximating function in an appropriate manner:

f(x) = a + b.x + c.x² + d.x³ + ... + cw.x^100

or maybe something like

f(x) = a.x^(-5) + b.x^(-4) + ... + f + g.x + ... + db.x^95

if you suspect that some reciprocals may be involved. It doesn't matter too much, but you can boost your accuracy if you choose the right format.

2. Create an expression for how 'different' the above equation (with the coefficients filled in) is from the ideal curve that fits your data. There are many examples out there, and a simple area-calculation integral in x performs well. You'll either have to adapt the documented method or calculate this reference curve first. That may seem to defeat the object, but the reference curve can be defined piecewise, e.g. by connect-the-dots or any spline method.

3. Decide upon an inner-product and use it to split the big long expression into 100 smaller, simpler ones and integrate up.

4. Now you can treat these 100 expressions (which just so happen to be linear) as equations in your coefficient set. Going back to the linear algebra discussed earlier (who said we'd ruled it out? [rolleyes]), we can throw the system into a matrix, Gaussian-eliminate it and back-substitute for the solution. This is just a fast (well, cubic time) computer-friendly way of doing the substitution method for simultaneous equations we're all so familiar with.

5. Once the coefficients are calculated, plug them into the expression in (1) and, next thing you know, you'll have a marvellously accurate curve to fit your data with provably minimal error.

One of the nicest things about this method is that if you picked your function correctly in (1), then you'll get exactly the correct answer. For example, if your nodes were generated using the function 'g(x) = x^2 - 4' and you used either of the example representations in (1), you'll find that the LSA result sets most of the coefficients to zero and the other two to -4 and 1, giving exactly the right answer. If the algorithm takes too much time to process, you can simplify the polynomial and reduce your dataset, trading off some accuracy for speed.
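As a sketch of that recovery property, here is a discrete least-squares stand-in for LSA (illustrative Python; the continuous inner-product machinery of steps 2 and 3 is replaced by a plain overdetermined solve): nodes from g(x) = x^2 - 4, fitted with the power-basis representation from step (1) truncated at degree 5:

```python
import numpy as np

# Nodes generated by the 'hidden' function g(x) = x^2 - 4.
x = np.linspace(-3, 3, 100)
y = x ** 2 - 4

# Representation from step (1): f(x) = a + b*x + c*x^2 + ... + f*x^5.
X = np.vander(x, 6, increasing=True)   # columns 1, x, x^2, ..., x^5

# Least squares recovers a = -4, c = 1 and zeroes the rest.
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coeffs, 6))
```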

Well, that's me done. I apologise for the long post, but hopefully somebody will get some use from it.

Regards
Admiral
Ring3 Circus - Diary of a programmer, journal of a hacker.
