# Finding Best Fit Lines Algorithm

This topic is 3762 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

So I have a basic system set up where I have N (x,y) points in an array. My goal is to find all linear regressions with r^2 greater than 0.95 of these points. For a naive implementation, I did something like this:
#points is our array of points.  n = points.size
1.upto(2**n-1) { |i|
s = i.to_s(2)
s = '0'*(n-s.size)+s

included_pairs = []
(s.length-1).downto(0) { |j|
included_pairs << points[j] unless s[j] != 1
}

#perform linear regression on included pairs
}


##### Share on other sites
def test_regression(major, minor)        if minor == 0            return []         end	s = (major | minor).to_s(2)	s = '0'*(n-s.size)+s		included_pairs = []	(s.length-1).downto(0) { |j|		included_pairs << points[j] unless s[j] != 1	}				### do linear regression here				if worked		### here we fork -- continuing on two paths		return [r_squared, etc] + test_regression(major | minor, minor>>2) + test_regression(major, minor>>1)	else		return test_regression(major, minor>>1) 	endendworking_regressions = []n.downto(1) { |i|	j = 2 ** (i-1)		major = j	minor = j/2		working_regressions = working_regressions + test_regression(j, j/2)}

maybe? does this look right (besides the scoping issues with n)? testing this is fucking impossible.

[Edited by - visage on July 30, 2008 10:02:50 AM]

##### Share on other sites
I don't understand your problem. What are you trying to do?

##### Share on other sites
Figured it out!

n = points.sizevalid_trends = []		def test_regression(points, major, minor, n, type, date)	delta = 0.025			if minor == 0		return []	end			s = (major | minor).to_s(2)	s = '0'*(n-s.size)+s		included_pairs = []	0.upto(s.length-1) { |j|		included_pairs << points[j] if s[j].to_i == 1	}				#### test linear regression here	if !invalid		### here we fork -- continuing on two paths		#puts "\t\tValid: #{est[0]}x + #{est[1]} @ #{start}"		return  [[r, est, start, included_pairs.map { |e| e[0] }]] + 			test_regression(points, major | minor, minor>>1, n, type, date) + 			test_regression(points, major, minor>>1, n, type, date)		else			return test_regression(points, major, minor>>1, n, type, date) 		end	endendif n > 1	j = 2**(n-1)	while j > 0				valid_trends = valid_trends + test_regression(points, j, j/2, n, type, date)		j = j >> 1	endend

I think that is the solution...it seems to work in empirical tests of the data...

Basically, I was trying to derive an equation whose expected runtime was less than 2^n (though worst case was still 2^n) by using logic to cull out possibilities that couldn't occur...

1. 1
Rutin
36
2. 2
3. 3
4. 4
5. 5

• 11
• 15
• 12
• 14
• 9
• ### Forum Statistics

• Total Topics
633352
• Total Posts
3011483
• ### Who's Online (See full list)

There are no registered users currently online

×