# Non-linear regression (data fitting)


## Recommended Posts

Hi... I want to fit a non-linear function of the form:

f(x, k1, k2, k3) = k1^2 * g(x, k2) + k3

to some data. I tried the Levenberg-Marquardt method (as it appears in Numerical Recipes) but the results aren't good enough. I suspect the problem is the initial guess for k1, k2, k3, because when I generate "experimental" data from the above function and then try to fit it, the results differ wildly depending on the initial values. By "experimental" data I mean that I evaluate the function at some points (2048) and add a little random noise. If I choose initial values near the correct ones, the algorithm finds the exact values; otherwise it returns completely incorrect results.

Is there any method that can converge even from completely wrong initial values? I know this depends on the function (how many local minima/maxima it has, where they are, etc.). What method do you prefer/suggest for robust results? Note that the curve can't be linearized (it contains cosines; too many of them, and too complex, to expand into series), so it has to be a nonlinear method.

Thanks in advance.
HellRaiZer

##### Share on other sites
You probably have several local minima. If you can model your data as f(x,k1,k2,k3) + n, where n is a random variable representing noise, _and_ if you know the variance of the noise approximately, then you can tell from the mean squared error (MSE) whether or not you are close to the global minimum.

If a minimum is found and the MSE is much larger than your estimate, choose new values for k1, k2 and k3 and start over. (Not the most sophisticated method, perhaps, but it beats total enumeration :))

##### Share on other sites

Quote:
 You probably have several local minima. If you can model your data as f(x,k1,k2,k3) + n, where n is a random variable representing noise, _and_ if you know the variance of the noise approximately, then you can tell from the mean squared error (MSE) whether or not you are close to the global minimum.

I can't do that, for one reason: to calculate the variance of the noise you mention, I need more than one sample at the exact same time and under the exact same conditions. I don't have that. The procedure that generates the data runs in real time, meaning that I get a complete list of samples (2048) every 0.5 secs and the total run is about 15 secs. So I can't have two or more samples at the same time, not even approximately.

Btw, something I forgot to mention because I thought it was obvious: the variable in the above function is x; k1, k2 and k3 are the parameters I have to calculate in order to fit the curve to the data.

Quote:
 If a minimum is found and the MSE is much larger than your estimate, choose new values for k1, k2 and k3 and start over. (Not the most sophisticated method, perhaps, but it beats total enumeration :))

What you describe is more like a brute-force approach, which I had already considered (but not implemented) in the hope of finding something more sophisticated and faster. The selection of the new k1, k2, k3 could be done with a steepest-descent-like method (local optimization), which avoids complete enumeration, but it would be too slow and there's a chance of getting stuck in a local minimum.

Because the whole thing runs at the above-mentioned rate (2048 samples per 500 msec), I was thinking of an algorithm that can run in parallel with the measurement, i.e. in real time. It doesn't have to be that way (I can fit the data after the procedure finishes), but it would be better if it were.

Thanks for the link, and please correct me if I haven't understood your point correctly.

HellRaiZer

##### Share on other sites
Would you mind telling us what g(x,y) is and what the data you are dealing with looks like? It might be easier to recommend a method if we knew a little more about the situation.

Also, if you could put one of those sets of 2048 samples somewhere for download, maybe we can even test a little. I would do it just for fun.

I'll probably try simulated annealing, or something inspired by genetic algorithms that I've been thinking about lately.

##### Share on other sites
Quote:
Original post by HellRaiZer

Quote:
 You probably have several local minima. If you can model your data as f(x,k1,k2,k3) + n, where n is a random variable representing noise, _and_ if you know the variance of the noise approximately, then you can tell from the mean squared error (MSE) whether or not you are close to the global minimum.

I can't do that, for one reason: to calculate the variance of the noise you mention, I need more than one sample at the exact same time and under the exact same conditions. I don't have that. The procedure that generates the data runs in real time, meaning that I get a complete list of samples (2048) every 0.5 secs and the total run is about 15 secs. So I can't have two or more samples at the same time, not even approximately.

I don't follow. You say you have 2048 samples, but you still can't use more than one sample at a time? Do you mean that you're streaming the data with strict latency requirements?

So, in fact, you need an online algorithm?

But still, if you can assume that the fit is correct for the past samples, you can always estimate the error variance (MSE) with a running variance, assuming the variance is stationary or at least changes slowly.

Quote:
 Btw, something I forgot to mention because I thought it was obvious: the variable in the above function is x; k1, k2 and k3 are the parameters I have to calculate in order to fit the curve to the data.

Yes.
Quote:

Quote:
 If a minimum is found and the MSE is much larger than your estimate, choose new values for k1, k2 and k3 and start over. (Not the most sophisticated method, perhaps, but it beats total enumeration :))

What you describe is more like a brute-force approach, which I had already considered (but not implemented) in the hope of finding something more sophisticated and faster. The selection of the new k1, k2, k3 could be done with a steepest-descent-like method (local optimization), which avoids complete enumeration, but it would be too slow and there's a chance of getting stuck in a local minimum.

I wouldn't go so far as to say it's a brute-force algorithm. The selection of the new guess for k1, k2 and k3 can be made by looking at the trajectory of the solution (k1,k2,k3) as it hit the local minimum. So if the selection of (k1,k2,k3) is done "optimally", the worst case only has to run the optimizer as many times as there are local minima.

Quote:
 Because the whole thing runs at the above-mentioned rate (2048 samples per 500 msec), I was thinking of an algorithm that can run in parallel with the measurement, i.e. in real time. It doesn't have to be that way (I can fit the data after the procedure finishes), but it would be better if it were. HellRaiZer

OK.

##### Share on other sites
Hi...

alvaro:
Quote:
 Would you mind telling us what g(x,y) is and what the data you are dealing with looks like? It might be easier to recommend a method if we knew a little more about the situation. Also, if you could put one of those sets of 2048 samples somewhere for download, maybe we can even test a little. I would do it just for fun. I'll probably try simulated annealing, or something inspired by genetic algorithms that I've been thinking about lately.

Thanks for the interest. First, a little description of the whole thing.
The process that generates the data is a physical one, a dissolution. It measures a physical quantity of a sample (its depth) based on the amount of light reflected off its surface. The sample's depth changes continually, and the machine that measures the reflected light collects data all the time. But in order to have a complete view of the process we have to analyze all the wavelengths involved (from about 300 nm to 1100 nm). The machine gathers data for approximately 15 msec per wavelength, so a complete set takes about 500 msec. The rest of the 2048 wavelengths are calculated using linear interpolation (I think so, at least; this is done internally by the machine). That's why I said I can't have two or more samples at the same time under the same conditions.

Here are the complete equation and a sample data file.
If you can make something of it, please post it here. The equation is in a Word document (obviously) using Math Equation 3.0.

Quote:
 I don't follow. You say you have 2048 samples, but you still can't use more than one sample at a time? Do you mean that you're streaming the data with strict latency requirements?

Sorry, but my English is really bad. As I said above, I have 2048 data points (x, y), but each data point isn't measured at exactly the same time as the rest.

Quote:
 So, in fact, you need an online algorithm?

I thought it would be better that way, but if I can't find an online one, I'll have to fit the data offline after the whole procedure finishes.

Quote:
 But still, if you can assume that the fit is correct for the past samples, you can always estimate the error variance (MSE) with a running variance, assuming the variance is stationary or at least changes slowly.

The key phrase here is "if you can assume that the fit is correct for the past samples". If this holds, then you are correct that I can use the previous values as guesses for the new fit. But the problem is how to get those first-time fitting parameters. If I had them, I think every fitting algorithm would perform fine. The problem is the first time.

I still haven't read the MathWorld page. I'll check it now and come back with any questions.

Thank you for helping. I hope it will be solved eventually!

HellRaiZer

##### Share on other sites
Do you know in what range d1 (a.k.a. k2) is likely to be? Is a value of -800 reasonable, for instance?

Can you check if this code computes the right thing? If not, can you fix it and post it?
```cpp
double Q[4]={1.003,1.4800,1.4537,3.0166};
double W[4]={0.000,0.003577,0.00342,0.4281};
double S[4]={0.000,0.000170,0.000,0.000};
double d2=1044;
double Four_Pi=16*atan(1.0);
double n[4];
double r[4][4];

void update_n(double inv_lambda)
{
  double inv_l2=inv_lambda*inv_lambda;
  double inv_l4=inv_l2*inv_l2;

  for(int i=0;i<4;++i)
    n[i]=Q[i]+W[i]*inv_l2+S[i]*inv_l4;
}

void update_r(double inv_lambda)
{
  update_n(inv_lambda);

  for(int i=0;i<4;++i)
    for(int j=0;j<4;++j)
      r[i][j]=(n[i]-n[j])/(n[i]+n[j]);
}

double f_over_g(double lambda, double d1)
{
  double inv_lambda=1.0/lambda;
  double FPIL=Four_Pi*inv_lambda;

  update_r(inv_lambda);

  double r012=r[0][1]*r[0][1];
  double r122=r[1][2]*r[1][2];
  double r232=r[2][3]*r[2][3];

  double c1=cos(FPIL*n[1]*d1);
  double c2=cos(FPIL*n[2]*d2);
  double c1p2=cos(FPIL*(n[1]*d1+n[2]*d2));
  double c1m2=cos(FPIL*(n[1]*d1-n[2]*d2));

  double g =
    1.0 + r012*r122 + r012*r232 + r122*r232 +
    2*r[1][2]*r[2][3]*c2*(1+r012) +
    2*r[0][1]*r[2][3]*c1p2 +
    2*r[0][1]*r[1][2]*c1*(1+r232) +
    2*r[0][1]*r122*r[2][3]*c1m2;

  double f =
    r012+r122+r232+2*r[0][1]*r122*r[2][3]+r012*r122*r232+
    2*r[0][1]*r[2][3]*c1p2+
    2*r[1][2]*r[2][3]*c2*(1+r[0][1]*r[2][3])+
    2*r[0][1]*r[1][2]*c1*(1+r[0][1]*r[2][3]);

  return f/g;
}
```

##### Share on other sites
Hi again...

Quote:
 Do you know in what range d1 (a.k.a. k2) is likely to be? Is a value of -800 reasonable, for instance?

No. d1 is the depth of the dissolved surface, so it must be greater than or equal to 0. For the data you have (which corresponds to the beginning of the process) it must be greater than 0, but I don't know an exact upper bound. It should be under 1000.
Also, 'A' must be greater than zero. If you prefer (and if you don't want to put more constraints on the system) you can replace it with, e.g., B = A^2; B can then take any value.

I made some corrections to the source. Q[0], S[0] and W[0] have changed (I found out they were wrong after I uploaded the .doc). One more change is the calculation of n: see the source (the two additional terms are multiplied by 10e6 and 10e12). Sorry for the mistakes :)

```cpp
double Q[4]={1.4930,1.4800,1.4537,3.0166};
double W[4]={0.004864,0.003577,0.00342,0.4281};
double S[4]={0.000071,0.000170,0.000,0.000};
double d2=1044;
double Four_Pi=16*atan(1.0);
double n[4];
double r[4][4];

void update_n(double inv_lambda)
{
  double inv_l2=inv_lambda*inv_lambda;
  double inv_l4=inv_l2*inv_l2;

  for(int i=0;i<4;++i)
    n[i]=Q[i]+W[i]*inv_l2 * 10e6 + S[i]*inv_l4 * 10e12;
}

void update_r(double inv_lambda)
{
  update_n(inv_lambda);

  for(int i=0;i<4;++i)
    for(int j=0;j<4;++j)
      r[i][j]=(n[i]-n[j])/(n[i]+n[j]);
}

double f_over_g(double lambda, double d1)
{
  double inv_lambda=1.0/lambda;
  double FPIL=Four_Pi*inv_lambda;

  update_r(inv_lambda);

  double r012=r[0][1]*r[0][1];
  double r122=r[1][2]*r[1][2];
  double r232=r[2][3]*r[2][3];

  double c1=cos(FPIL*n[1]*d1);
  double c2=cos(FPIL*n[2]*d2);
  double c1p2=cos(FPIL*(n[1]*d1+n[2]*d2));
  double c1m2=cos(FPIL*(n[1]*d1-n[2]*d2));

  double g =
    1.0 + r012*r122 + r012*r232 + r122*r232 +
    2*r[1][2]*r[2][3]*c2*(1+r012) +
    2*r[0][1]*r[2][3]*c1p2 +
    2*r[0][1]*r[1][2]*c1*(1+r232) +
    2*r[0][1]*r122*r[2][3]*c1m2;

  double f =
    r012+r122+r232+2*r[0][1]*r122*r[2][3]+r012*r122*r232+
    2*r[0][1]*r[2][3]*c1p2+
    2*r[1][2]*r[2][3]*c2*(1+r[0][1]*r[2][3])+
    2*r[0][1]*r[1][2]*c1*(1+r[0][1]*r[2][3]);

  return f/g;
}
```

Because I couldn't find a good solution to the whole problem (I tried a commercial program too, GraphPad Prism, but with no luck), I asked the guy who performed the experiments to see if there is something wrong with the data. I'll have an answer tomorrow (it's 21:00 here now, so I'll have to wait a bit). I'll post his answer here, as well as new data if the current set turns out to be wrong. I'll also check the equation to confirm it's the correct one, to eliminate all potential problems.

Once again, thanks a lot for the help and interest. I'll keep you informed.

HellRaiZer

##### Share on other sites
I tried the corrected code, and it looks like the data doesn't match anything of the form A*f/g + C at all. I suspect there are still errors in the formula... :(

##### Share on other sites
You need something a little more powerful than regressing to a curve. Your problem sounds like an ideal one for particle filtering: it's designed to determine the hidden parameters of a system given a noisy nonlinear model of the system and noisy indirect measurements over time.
