Advertisement Jump to content
Sign in to follow this  
iraxef

scriptMath and closeTo()

This topic is 1791 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

The article you cited (in rev 1832) looked familiar. For reference, here is what we use:

namespace RelativeFltCmp
{
    template <size_t bytes>
    struct TypeWithSize
    {
        typedef void UInt;
    };

    template <>
    struct TypeWithSize<4>
    {
        typedef uint32_t UInt;
    };

    template <>
    struct TypeWithSize<8>
    {
        typedef uint64_t UInt;
    };

    template <typename RawType>
    class FloatingPoint
    {
    public:
        typedef typename TypeWithSize<sizeof(RawType)>::UInt Bits;

        static const size_t kBitCount = 8*sizeof(RawType);
        static const size_t kFracBitCount = std::numeric_limits<RawType>::digits - 1;
        static const size_t kExpBitCount = kBitCount - 1 - kFracBitCount;

        static const Bits kSignBitMask = static_cast<Bits>(1) << (kBitCount - 1);
        static const Bits kFracBitMask = ~static_cast<Bits>(0) >> (kExpBitCount + 1);
        static const Bits kExpBitMask = ~(kSignBitMask | kFracBitMask);

        static const size_t kMaxUlps = 4;

        explicit FloatingPoint(
            const RawType& x )
            : value_(x)
        {}

        inline bool AlmostEquals(
            const FloatingPoint& rhs ) const
        {
            // check for NaN to match == behavior
            if (is_nan() || rhs.is_nan())
                return false;

            return ULP_diff(bits_, rhs.bits_) <= kMaxUlps;
        }

    private:
        inline bool is_nan() const
        {
            return ((kExpBitMask & bits_) == kExpBitMask) &&
                ((kFracBitMask & bits_) != 0);
        }

        inline Bits SignAndMagnitudeToBiased(
            const Bits& sam ) const
        {
            if (kSignBitMask & sam)
                return ~sam + 1;  // two's complement
            else
                return kSignBitMask | sam;  // * 2
        }

        inline Bits ULP_diff(
            const Bits& sam1,
            const Bits& sam2 ) const
        {
            const Bits biased1 = SignAndMagnitudeToBiased(sam1);
            const Bits biased2 = SignAndMagnitudeToBiased(sam2);

            return (biased1 >= biased2) ? (biased1 - biased2) : (biased2 - biased1);
        }

        union
        {
            RawType value_;
            Bits bits_;
        };
    };
}

static inline bool FltCmp(
    float lhs,
    float rhs )
{
    return RelativeFltCmp::FloatingPoint<float>(lhs).AlmostEquals(RelativeFltCmp::FloatingPoint<float>(rhs));
}


static inline bool FltCmp(
    double lhs,
    double rhs )
{
    return RelativeFltCmp::FloatingPoint<double>(lhs).AlmostEquals(RelativeFltCmp::FloatingPoint<double>(rhs));
}

Edited by iraxef

Share this post


Link to post
Share on other sites
Advertisement

Actually that link (http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm) mentions that it's obsolete and points to: http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

 

Testing both closeTo (implemented in r1832) and the FltCmp posted above, both fail his so-called 'catastrophic cancellation' (comparing the result of a subtraction that's very close to 0 against 0; e.g. comparing 67329.242f - 67329.234f [they're 1 ulp apart] to 0.f).

 

His AlmostEqualUlpsAndAbs() works (see below and the attached test file), though it's not practical for all (most?) situations since you'd have to know the original numbers (the 67329.242f and 67329.234f, not just the end-result of the subtraction).

FltCmp(diff,0) = 0
FltCmp(sin(PI),0) = 0

closeTo(diff,0) = 0
closeTo(sin(PI),0) = 0

AlmostEqualUlpsAndAbs(diff,0) = 1
AlmostEqualUlpsAndAbs(sin(PI),0) = 1

That being said, closeTo() has (in the worst case) 3 fabs() and a divide, whereas FltCmp is "more likely to be efficient on architectures such as SSE which encourage the reinterpreting of floats as integers". But that's neither here nor there.

 

Is anyone (reading this forum) using something like AlmostEqualUlpsAndAbs() in production code?

Edited by iraxef

Share this post


Link to post
Share on other sites

I don't worry too much about performance. I doubt closeTo() is going to be called that many times from a script that it is going to be a causing a significant impact on the response time of the overall script execution.

 

I am also a bit wary about doing operations directly on the bits of the float values. Though for sure almost all platforms today use the IEEE734 standard for the float values, I shouldn't make that assumption in the library code. It would be fine to do so in application code where you are in control of on what platforms the code will be used, but with AngelScript I'm trying to be as portable as possible.

 

I'll definitely look into the case you mentioned above. The code must not fail to recognize these numbers are being close to each other.

Share this post


Link to post
Share on other sites

You could use the AlmostEqualRelativeAndAbs() variant from the article (which uses a relative epsilon for the away-from-0 case and not ulps, and therefore doesn't cast the float to ints). See updated attachment.

FltCmp(diff,0) = 0
FltCmp(sin(PI),0) = 0

AlmostEqualUlpsAndAbs(diff,0) = 1
AlmostEqualUlpsAndAbs(sin(PI),0) = 1

AlmostEqualRelativeAndAbs(diff,0) = 1
AlmostEqualRelativeAndAbs(sin(PI),0) = 1

closeTo(diff,0) = 0
closeTo(sin(PI),0) = 0

You've still got the issue of choosing the right close-to-0 and max-relative diffs. 

 

My understanding so far from reading the article is that the issue lies in the fact that there is a ton of float precision near 0 and significantly less far away from 0. His example is 67329.234 and 67329.242. There are no valid floats between those two numbers, so they're 1 'ulp' away [if you interpret the floats as ints and subtract the ints]. However, if you subtract those two floats, you get .0078125000. Because of the high precision around 0, that's actually 1,006,632,960 ulps away from 0 [there are 1,006,632,960 valid floats between 0 and .0078125000]. 0.007 is much larger than any typical epsilon value [e.g. closeTo defaults of 0.0001f]. To be able to properly detect that the subtraction of those two large floats is close to 0, you need to have the floats themselves available so that you can determine the proper close-to-0 epsilon to use. Which is still depending on the user to do the right thing, but at least if they do, AlmostEqualRelativeAndAbs() does the correct thing (unlike FltCmp() and closeTo()).

 

Per the author's suggestion in one of the article comments, I'd default 'maxDiff' to the value of FLT_EPSILON and 'maxRelDiff' to something like 3*FLT_EPSILON or 4*FLT_EPSILON (typical epsilons). Along with a comment at the top of the function, maybe, explaining how to provide the correct 'maxDiff' for the close-to-0 case if you have the numbers available.

Share this post


Link to post
Share on other sites

A float only has precision for about 7 digits. It doesn't matter what the value of the exponent is. In comparison a double has 15 digits.

 

Internally the binary representation for 67329.234 and 0.673929234 is much the same, the difference is in the exponent part.

 

The comparison for closeness must be aware of this, which is why the absolute difference cannot be used directly (unless both values are close to zero).

 

 

Another way of thinking about it, is to have the epsilon value change according to the magnitude of the numbers. If the numbers are in the order of 10000, then the epsilon should be around 0.1. If the numbers are in the order of 1, then epsilon should be around 0.00001, and so on.

Share this post


Link to post
Share on other sites

The idea of choosing an epsilon based on the magnitude of the original numbers is precisely what the article suggests (their version is multiplying the bigger number by FLT_EPSILON, as I do in the attached example code). The problem is when you're comparing e.g. 0.007 with 0 and at that point you don't have the original numbers (which were subtracted), which were on the order of 60000. If you looked at the magnitude of 0.007 and 0, you'd conclude that you want a tiny epsilon.. but that doesn't work for the subtraction which was done. So again, it's up to the user to choose the epsilon and if they had the original numbers available (to judge their magnitude), something like max(A,B)*FLT_EPSILON is more precise.

Share this post


Link to post
Share on other sites

To follow up on my previous code posting, we're now trying something like this:

/**
 * Check whether two floats are close enough to be considered equal.
 *
 * @param    lhs                left hand size of comparison
 * @param    lhs                right hand size of comparison
 * @param    nearZeroEpsilon    absolute epsilon to use for numbers that should be
 *                              at or near 0. Default of FLT_MIN is fine for ensuring
 *                              that subnormals/denormals (dividing by which would produce
 *                              infinity) are considered equal to 0. For 'catastrophic cancellation'
 *                              situations, like comparing sin(pi) or (67329.242f - 67329.234f)
 *                              against 0, maxInput*FLT_EPSILON may be a good epsilon
 *                              (e.g. pi*FLT_EPSILON or 67329.242f*FLT_EPSILON).
 *                              See http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition
 * @param    ulpDiff            Used when numbers are not being compared against 0.
 *
 * @return   True if values are considered equal.
 */
static inline bool FltCmp(
    float lhs,
    float rhs,
    float nearZeroEpsilon = FLT_MIN,
    float ulpDiff = 4 )
{
    if ( (lhs == rhs) || ((rhs == 0.f || lhs == 0.f) && fabsf(lhs-rhs) <= nearZeroEpsilon) )
        return true;

    return RelativeFltCmp::FloatingPoint<float>(lhs).AlmostEquals(RelativeFltCmp::FloatingPoint<float>(rhs), ulpDiff);
}

And the binding for that function is now:

RegisterGlobalFunction("bool FltCmp(float, float, float = 1.175494351e-38, uint = 4)", asFUNCTIONPR(FltCmp, (float, float, float, uint), bool), asCALL_CDECL)

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

GameDev.net is your game development community. Create an account for your GameDev Portfolio and participate in the largest developer community in the games industry.

Sign me up!