Evaluate my float comparison strategy, plz

Started by
35 comments, last by Nice Coder 19 years, 3 months ago
I have been fighting for quite some time with epsilons, and I have been hating the ambiguity of them - one kind of epsilon would have no impact on numbers too large, while having an enormous impact on smaller numbers. What I have tried, is to scale my epsilon to match the numbers under comparison. This is what I came up with:

						// 0.0000001% within each other
#define EPSILON_DEFINER				10000000.0f;

inline bool FloatCmp(float a_value1, float a_value2)
{
	// sort the values
	SortValues(&a_value1, &a_value2);
	// then see if the biggest value is within decent range of the lowest
	if(fabs(a_value2) > fabs(a_value1)*10.0f)
		return false;

	float l_EPSILON;

	// guard against division with 0
	if(a_value1 == 0.0f)
		l_EPSILON = 0.0000001;
	else
		// ok, they are in the same range, create a decent EPSILON
		l_EPSILON = fabs(a_value1) / EPSILON_DEFINER;

	return (a_value2 - a_value1 < l_EPSILON);
}


template<class Item>
void SortValues(Item* a_value1, Item* a_value2)
{
	if(*a_value2 < *a_value1)
	{
		Item l_valuebuffer = *a_value1;
		*a_value1 = *a_value2;
		*a_value2 = l_valuebuffer;
	}
}







Edit: This is an altered version compared to the original post, the first was flawed, but this should work. If needed, EPSILON_DEFINER can be set to something else, if you want to have a hit on numbers further from each other - or if you want even more precision when 64-bit floats come along. Another way could be to make a global, that you allow the program to initialize upon startup... Ok, this is what I do: I start by sorting my input values (a_value2 will be the biggest). Then I check if they are within decent range of each other (I multiply smallest number by 10 - if large number is still larger, then they are obviously not equal. If they are in the same range, then I create an EPSILON based on their size, before I make the standard test (notice, I know that a_value2 is the largest, since I sorted it before, therefore I don't need to fabs the subtraction). Do you see any flaws? And how do you evaluate the costs? We are talking 3 comparisons (one inside SortValues()), one multiplication, one division, one subtraction. Too expensive? (Notice that SortValues() is declared inline in its header file, so there is no overhead to the function call). [Edited by - Mercenarey on January 1, 2005 11:24:46 AM]
Quote:CalvinI am only polite because I don't know enough foul languageQuote:Original post by superpigI think the reason your rating has dropped so much, Mercenarey, is that you come across as an arrogant asshole.
Advertisement
First I have to say, that I'm not sure what you exactly want. It seems to me like you want a sort of generic float comparison function, but why?

If you are looking if the temperature on two locations are the same, then you might want to call 0.0 and 0.2 equal. In other situations, like when 0.0 and 0.2 would indicate Amperes of current for example, you would certainly not call them equal. The truth is, that the epsilon is different in every situation, and that you should not look at the size of the floating point numbers.

If I am wrong and you're using it for some higher purpose, have a look at this:

http://www.gamedev.net/community/forums/topic.asp?topic_id=291163

As you can see, floating point variables cosist out of an exponent and a mantissa. The best thing you can do is:
1. Compare the exponents. If they are 2 or more a part then there's at least a factor 2 difference, and you would call them different (so false), else:
2. Compare the mantissae. Look at their difference, if it is bigger than some integer 'strictness' value (between 0 and 2^23), than they are not equal, else:
3. They are 'equal'

Good luck with whatever you're doing ;)
Bas

[Edited by - basananas on January 1, 2005 1:04:18 PM]
helpful part: It just doesn't work. FloatCmp(-10,-10) = false;

I wrote that code
inline bool FloatCmp(float a, float b){ // keep your code clean. Long nondescriptive parameter names // do not improve readability. // Prefixes in pure functions do not improve readability. if(abs(a-b)<abs_tolerance) return true;  if(a>0){  if(b>a*(1+rel_tolerance) || a>b*(1+rel_tolerance)) return false; }else{  if(b<a*(1+rel_tolerance) || a<b*(1+rel_tolerance)) return false; } return true;}

but i suggest you not to use such things.

As about performance, you can first check if difference of exponents of floats <= 1 , and then do rest, it should accelerate somewhat.

rant part:
I see flaw in entire idea to make universal "FloatCmp". There is no single reliable way to "compare floats for equality". Every "method" will fail in some uses. With bigger numbers you typically get bigger error, but how much it grow depends to what calculations you do. Usually you get linear dependence, but in some cases you can get quadratic. It is much better if every time you have need to compare floats, you think what maximal tolerance is acceptable _for software user_ in that specific case. For example, you need to check if point is on line. It is better to check if point is inside some reasonable cone, (for example, cone that corresponds to pixel size) and if it fails, you need bigger precision than floats...

In most of my programs, i compare floats with certain tolerance that depends to specific case. For example, i do not need big precision for pixel color values.

Float comparasion is very "innatural" operation, you should be able to avoid it. Usually you don't really need to compare floats but need to compare for specific well-known range. If you don't know range, it might be a sign that you are doing something conceptually wrong.

edit: as about exponents, it's almost as basananas said, except that you can't just compare matissas.
Example:
1.999999 and 2.0000001 have different exponents and therefore matissas is very different too. First is something like 1.1111111...0...*2^0 , second is like 0.000000...1...*2^1 , so matissas differ. Branching is costly, so handling such cases specially is not really much faster.

edit: and strictly speaking, float epsilon is more or less well-defined thing, and is used in iterative algorithms to stop when precision is good enough and will not improve anymore.
Dmytry:

Ya, you are right, it fails with minus-numbers. I didn't test it properly. I will work on that :)


Ok, thats a minor detail, however.


The reason why I want a generic epsilon to compare with, is to be able to set size of a world dynamically. Of course I could set the epsilon when I set the world parameters...

Basananas:
I will take your advice and look into exponent/mantissa.
But how do I get access directly to the mantissa and exponent values?

[Edited by - Mercenarey on January 1, 2005 10:23:51 AM]
Quote:CalvinI am only polite because I don't know enough foul languageQuote:Original post by superpigI think the reason your rating has dropped so much, Mercenarey, is that you come across as an arrogant asshole.
Quote:Original post by basananas
1. Compare the exponents. If they don't match then there's at least a factor 2 difference, and you would call the different, else:

Not true, the difference can be much smaller than a factor 2. For instance, 1.0 and 0.99999999 have different exponents.
Why not take the easy approach and write a function that evaluates a tolerance?
bool equal(float a, float b, float tol=.000001){ return (abs(a-b)<tol)?true:false }
Seems to me to be a heck of a lot easier than messing around with float internals, and achieves the same purpose/
Quote:Original post by thedustbustr
Why not take the easy approach and write a function that evaluates a tolerance?
bool equal(float a, float b, float tol=.000001){ return (abs(a-b)<tol)?true:false }
Seems to me to be a heck of a lot easier than messing around with float internals, and achieves the same purpose/

read posts above, esp. my post and code.
At bigger numbers you need bigger "tol"
Quote:Original post by Anonymous Poster
Quote:Original post by basananas
1. Compare the exponents. If they don't match then there's at least a factor 2 difference, and you would call the different, else:

Not true, the difference can be much smaller than a factor 2. For instance, 1.0 and 0.99999999 have different exponents.


Yeah, that's true. You'll have to look for a difference of at least 2, and if the difference is smaller, you'll have to look at the mantissa of both of the values.

I've written an example for you (Mercenarey that is :)) Use it, tear it apart and do whatever you want to do with it.

#include <iostream>using namespace std;#define STRICTNESS 1.2  //The strictness function check returns true if:                        //    (A/B) < STRICTNESS, in which A is larger than B                        //and A and B are the two compared floating points                        //STRICTNESS may be any value between 1.0 and 2.0int check(float * a, float * b);float a1 = 5;float a2 = 4.6;int main() {    cout << check (&a1, &a2);    return 0;}//Pre: both are of the same sign (either both positive or both negative)//Post: true if max(a,b)/min(a,b) < strictness, false otherwiseint check(float * a, float * b) {    unsigned int toIntA = * ((unsigned int *) a);    unsigned int toIntB = * ((unsigned int *) b);    //Check if the floats had the same sign    if (toIntA & 0x80000000) {                 //float a was negative (bit 0 = sign bit)       if (!(toIntB & 0x80000000)) return 0;   //Different signs, return false    }    //Calculate exponents and mantissae of a and b    unsigned int expA = ((toIntA & 0x7F800000) >> 23) - 127; //See the forum thread I was talking about...    unsigned int mantissaA = 0x800000 | (toIntA & 0x007FFFFF);        //idem dito, or ask on the forum if you don't understand it    unsigned int expB = ((toIntB & 0x7F800000) >> 23) - 127;    unsigned int mantissaB = 0x800000 | (toIntB & 0x007FFFFF);    if (expB - expA > 1 || expA - expB > 1) return 0;     //Considering that their exponents are at least 2 apart                                                          //expA/expB or expB/expA must be greater than 2, thus out of range    if (expA > expB) {        mantissaA = mantissaA << 1;        //expA is still greater than expB, so shift it's mantissa right    } else {        if (expB > expA) {            //Shift mantissaB right and swap it with mantissaA to make sure that A > B            unsigned int swap = mantissaA; mantissaA = mantissaB << 1; mantissaB = swap;        } else {            //Equal exponents            if (mantissaB > mantissaA) {                //swap the mantissae so that A > B                unsigned int swap = mantissaA; mantissaA = mantissaB; mantissaB = swap;            }        }    }    //Now just look if mantissaA / mantissaB smaller than the STRICTNESS value:    if ((double)mantissaA / (double)mantissaB < STRICTNESS) return 1;    return 0;}


Good luck,
Bas
thedustbustr:

Try this:
float aa = 10000000.0f;
float bb = 10000001.0f;

float cc = fabs(aa - bb);

cc will 1, even though the difference between the numbers is extremely small (percentage-wise), and your small epsilon will detect them as being different, even if you might wish to evaluate them to being equal.

Similarly, if you compare these:

float aa = 0.000000000000f;
float bb = 0.000000000001f;

float cc = fabs(aa - bb);

cc will be extremely small, and you need an even smaller epsilon to detect a difference. Your simple epsilon of 0.000001f will have no chance to see a difference - it will first see a difference when the numbers are a million times bigger!
That is the epsilon problem - you can not use the same epsilon for all magnitudes. And that is what I try to solve with my code.

For some numbers it will be far too small, for others it will be far too big.

basananas:
Thanx alot! Those ways to evaluate mantissa/exponent was just what I was looking for :)
Quote:CalvinI am only polite because I don't know enough foul languageQuote:Original post by superpigI think the reason your rating has dropped so much, Mercenarey, is that you come across as an arrogant asshole.
While basananas approach might work, I don't see why you would want to complicate things like that. Just try something like

return fabs(a - b) <= epsilon * max(fabs(a), fabs(b))

or

return fabs(a - b) <= epsilon * (fabs(a) + fabs(b))

This topic is closed to new replies.

Advertisement