FloatCompareThread.txt

From: Steve Hollasch <stevehol@microsoft.com>
Subject: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 2:57 PM


====  The Problem With Typical Floating-Point Comparison Code

I've been thinking about floating-point equivalence techniques lately and
the way that floating-point variance is typically handled.
    Usually, I'll see code like this:

        if (fabs(x-y) < 0.000001) then x and y are considered equal.

    I believe that this approach is fundamentally flawed for most uses;
this form of comparison is valid only for *tolerances*, which are absolute
measures based around some fixed coordinate point.  The salient point is
that the epsilon value (0.000001 above) is effectively anchored to a
particular range (that is, treated as fixed-point), while the other operands
are allowed to float normally, with varying exponents.
    Whenever I see this sort of operation performed, it's always of the form
listed above, and is typically laid down in layers, where the programmer is
both lazy and confused, and tweaks the epsilon value ad hoc until some
specific test passes.  This code usually fails for some subsequent condition
(when the two operands lie in some other region on the real line), at which
point folks typically throw up their hands and say "well, that's floating
point!"
    For many values of X and Y, this expression will be true ONLY if both
values are bitwise identical.  This will happen when the exponents of the
two operands are much larger than the exponent of the epsilon value
(specifically, when their exponents are larger by the number bits in the
mantissa).  For example, for some values of X and Y, the smallest
representable delta between the two may be something like 2^10.  In these
cases, the above expression works as though the epsilon were zero.
    For other values of X and Y, their exponents may be small enough that
the expression will be true no matter how badly scrambled the mantissa bits
are, as long as the exponents are unreachable by the epsilon value.  This
will happen when both  exponents are smaller by the number of bits in the
mantissa.
    In between the ranges that lie completely outside the epsilon's reach,
the effective epsilon will swing from zero to infinity.  The only time the
epsilon works as desired is when one of the operand exponents is zero.
    As I said, this all makes sense when you're manufacturing and need a
physical tolerance, but it's demonstrably lousy when used to compare results
for finite numerical equivalence.


====  In Search of a Better Comparison

What is really desired (in most situations I've seen) is a way to ask if the
numbers match to a certain degree in their *representation*.  More
specifically, given the IEEE 794 representation of a 32-bit floating point
number (which has a 1+23 bit mantissa), I'd like to compare two values
according to their 20 most significant bits, for example.  This would allow
me to compare any two numbers at any point in their representational range,
and would be much less brittle than the standard comparison shown above.
    I've not seen anything that addresses this issue.  Is this a recognized
problem with (hopefully) standard solutions?  My main goal here is that I
want to have an efficient way to compare two IEEE floating-point numbers
according to their significant bits, for any representable region.


====  Solution One:  Mantissa Truncation

(If you're unfamiliar with IEEE floating-point representation, see
<http://research.microsoft.com/~hollasch/cgindex/coding/ieeefloat.html> for
a quick overview of how floating-point values are encoded.)
    One way to attack this problem is to do significant-bit truncation.  To
do this, just treat the floating-point value as an integer, clear the low
order bits (some arbitrary number specified by the programmer) and perform a
floating-point compare.
    This has many benefits.  For one, it's relatively fast.  I say
relatively because you still have to drop it into memory or transfer to an
integer register to perform the integer operation of clearing the bits, and
then move it back to a floating point register (for most architectures.)
Nevertheless, unless this type of comparison is supported natively by the
underlying architecture, you're going to have to do integer twiddling on the
bits anyway.
    Even better, I believe this approach would work for all types of
floating pointer numbers:  normalized, denormalized, NaNs, infinities, and
zeros.  It's very desirable to avoid a decoding phase of these value types.
    The downside to this approach is that it doesn't do well at the
boundaries between two exponents, for example, the following two values will
compare differently with this scheme (using an 1+8 bit mantissa, and
clearing the low 3 mantissa bits):

        1.00000101 x 2^100
        0.11111110 x 2^100 (represented as 1.11111100 x 2^99)

would be clipped to

        1.00000000 x 2^100
        0.11111000 x 2^100 (represented as 1.11110000 x 2^99)


====  Solution Two:  Epsilon Synthesis

Another way to attack this problem is to synthesize a floating epsilon (as
compared to anchored) value that yields the desired behavior.  This way we
can get an epsilon that floats along with the two operands, rather than
being fettered lamely to zero.  In this approach, we're given the two
operands x & y, and a value q, which is the number of mantissa bits to
discard.  This comparison would look like this:

        [example:  x=1.11111100 x 2^99,    y=1.00000101 x 2^100,     and
q=3]

        (1)  Filter out non-finite numbers (NaNs, infinities) and handle as
desired.
        [doesn't apply for this example.]

        (2)  If the two exponents differ by more than one, then return
false.
        [doesn't apply for this example.]

        (3)  Construct an epsilon with the exponent of the largest number,
and the mantissa with a bit set in the q'th position.
        [example:  0.00000100 x 2^100, represented as 1.00000000 x 2^94]

        (4)  Use the typical comparison, but with the synthesized epsilon.
        [example:  if (fabs(x-y) < synthEpsilon) then x and y are considered
equal.]

    The drawback to this approach is that it's a good bit more complicated
than the first solution, and requires a fair amount of surgery on the
underlying floating-point representation.  I may also have missed some flaws
in this approach.
    The advantage to this approach is that it gives me pretty much what I
want, and is robust across the range of representable values.


====  Is There a Better Way?

What would be great is a straight-forward mathematical way to accomplish
this (disregarding performance implications.)  This would allow me to test
it out everywhere, or deploy it where performance is not critical, and would
work for all floating-point representations (though IEEE is what almost
everybody uses.)
    I'd also like to find out if this is a recognized problem, and what
other approaches are employed to address this.
--------------------------------------------------------------------------------------
From: mark carroll <carroll@cis.ohio-state.edu>
Subject: Re: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 3:12 PM

Call me naive, but couldn't you just divide one number by the other
and see how close to 1 the result is? Should work well whether
x, y are around 10^-20 or 10^20!

-- Mark
--------------------------------------------------------------------------------------
From: Steve Hollasch <stevehol@microsoft.com>
Subject: Re: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 3:27 PM


"mark carroll" <carroll@cis.ohio-state.edu>
<<  Call me naive, but couldn't you just divide one number by the other and
see how close to 1 the result is?  Should work well whether x, y are around
10^-20 or 10^20!  >>

    Good point; I forgot to mention this.
    First, the division will cost 70 cycles right away, so it's an expensive
solution.  However, I also solicited mathematical solutions where
performance was not an issue, and 70 cycles won't kill me in a lot of
situations.
    A second problem is that division itself is subject to precision loss,
so I could never really assert that two numbers were accurate to, say, 20
mantissa bits.  One could well dismiss this requirement as somewhat
artificial, though.
    So basically, this approach would use the following expression:

        (fabs(x/y - 1) < epsilon)

    Hmmm.  This would definitely be an improvement.  I'll have to cobble up
some tests to see how well it performs to epsilon synthesis.
--------------------------------------------------------------------------------------
From: Nick Maclaren <nmm1@cus.cam.ac.uk>
Subject: Re: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 5:54 PM

In article <8e7fvf$i6m$1@news.uic.com>,
Steve Hollasch <stevehol@microsoft.com> wrote:
>
>"mark carroll" <carroll@cis.ohio-state.edu>
><<  Call me naive, but couldn't you just divide one number by the other and
>see how close to 1 the result is?  Should work well whether x, y are around
>10^-20 or 10^20!  >>
>
>    Good point; I forgot to mention this.
>    So basically, this approach would use the following expression:
>
>        (fabs(x/y - 1) < epsilon)
>
>    Hmmm.  This would definitely be an improvement.  I'll have to cobble up
>some tests to see how well it performs to epsilon synthesis.

That simple form is subject to spurious overflows and division
by zero.  The classic form is:
    fabs(x-y)/max((fabs(x),fabs(y)),DBL_MIN) < epsilon

If you look in the right books (ones on numerical analysis and
programming of the 1950s and 1960s), you will find the fixed
versus floating point debate thrashed out in great detail.  The
answer is that sometimes one is more appropriate, and sometimes
the other.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
--------------------------------------------------------------------------------------
From: Ben Pfaff <pfaffben@msu.edu>
Subject: Re: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 5:58 PM

nmm1@cus.cam.ac.uk (Nick Maclaren) writes:

> That simple form is subject to spurious overflows and division
> by zero.  The classic form is:
>     fabs(x-y)/max((fabs(x),fabs(y)),DBL_MIN) < epsilon
> 
> If you look in the right books (ones on numerical analysis and
> programming of the 1950s and 1960s), you will find the fixed
> versus floating point debate thrashed out in great detail.  The
> answer is that sometimes one is more appropriate, and sometimes
> the other.

Would you mind giving some references on that?  I wouldn't mind
adding a few books to my library.
--------------------------------------------------------------------------------------
From: Nick Maclaren <nmm1@cus.cam.ac.uk>
Subject: Re: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 6:34 PM

In article <87ln20v81q.fsf@pfaffben.user.msu.edu>,
Ben Pfaff  <pfaffben@msu.edu> wrote:
>nmm1@cus.cam.ac.uk (Nick Maclaren) writes:
>
>> That simple form is subject to spurious overflows and division
>> by zero.  The classic form is:
>>     fabs(x-y)/max((fabs(x),fabs(y)),DBL_MIN) < epsilon
>> 
>> If you look in the right books (ones on numerical analysis and
>> programming of the 1950s and 1960s), you will find the fixed
>> versus floating point debate thrashed out in great detail.  The
>> answer is that sometimes one is more appropriate, and sometimes
>> the other.
>
>Would you mind giving some references on that?  I wouldn't mind
>adding a few books to my library.

I would if I could remember them!  Sorry, but it is just too long
ago.  It is worth looking up anything by Wilkinson, and one
relevant search term is "automatic scaling".  Try asking on
sci.math.num-analysis.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
--------------------------------------------------------------------------------------
From: Steve Hollasch <stevehol@microsoft.com>
Subject: Re: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 6:21 PM


"Nick Maclaren" <nmm1@cus.cam.ac.uk> wrote:
<<
The classic form is:
    fabs(x-y)/max((fabs(x),fabs(y)),DBL_MIN) < epsilon
>>

    There seems to be a typo in that equation: (fabs(x),fabs(y)).  Seems
like there should be an operator instead of a comma.  Subtraction doesn't
seem to work:  if x=r and y=r+d, then the equation evaluates to (d/DBL_MIN <
epsilon), which doesn't make sense.

   If it's a divide, then the equation evaluates to (d+(d^2/r) < epsilon),
which I can't quite get my head around.
--------------------------------------------------------------------------------------
From: Nick Maclaren <nmm1@cus.cam.ac.uk>
Subject: Re: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 6:39 PM

In article <8e7q66$imm$1@news.uic.com>,
Steve Hollasch <stevehol@microsoft.com> wrote:
>
>"Nick Maclaren" <nmm1@cus.cam.ac.uk> wrote:
><<
>The classic form is:
>    fabs(x-y)/max((fabs(x),fabs(y)),DBL_MIN) < epsilon
>>>
>
>    There seems to be a typo in that equation: (fabs(x),fabs(y)).  Seems
>like there should be an operator instead of a comma.  Subtraction doesn't
>seem to work:  if x=r and y=r+d, then the equation evaluates to (d/DBL_MIN <
>epsilon), which doesn't make sense.

Er, yes, you are quite right.

    fabs(x-y)/max(max(fabs(x),fabs(y)),DBL_MIN) < epsilon

Or, in a suitable language:

    abs(x-y)/max(abs(x),abs(y),DBL_MIN) < epsilon

The only point of the DBL_MIN is to avoid a division by zero if
both operands are zero.  Or, actually, if both are subnormal and
the hardware prenormalises for division but not negation and
comparison!  I can name several systems that did and do that,
so it is not merely a theoretical point.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
--------------------------------------------------------------------------------------
From: Eric Rudd <rudd@cyberoptics.com>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 3:04 PM

Nick Maclaren wrote:

> >    So basically, this approach would use the following expression:
> >
> >        (fabs(x/y - 1) < epsilon)
>
> That simple form is subject to spurious overflows and division by zero.  The
> classic form is:
>     fabs(x-y)/max(max(fabs(x),fabs(y)),DBL_MIN) < epsilon

There was another trick used in Linpack, along the following lines:  If one
wishes to see if fabs(x-y) is smaller than epsilon*max(fabs(x),fabs(y)), one can
do the following test:

   sum = max(fabs(x),fabs(y));
   diff = fabs(x-y);
   if (sum + diff == sum) {
      /* Nearly equal */
   } else {
      /* Significantly unequal */
   }

The idea is that if "diff" gets lost in the addition to "sum", it is smaller
than epsilon*sum.  The code thus does an implicit test of the machine precision.

Unfortunately, on platforms where intermediate results are held to higher
precision (such as Intel coprocessors), the test doesn't work very well, which
makes it less portable than one would like.  Furthermore, compilers often ignore
one's pleas for arithmetic consistency:

   if ((double) (temp + diff) == (double) temp) ...

so the method has fallen into disuse.  However, it is too cute to be forgotten
entirely.

-Eric Rudd
rudd@cyberoptics.com
--------------------------------------------------------------------------------------
From: Dave Eberly <eberly@magic-software.com>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 12:43 AM

Steve Hollasch <stevehol@microsoft.com> wrote in message
news:8e7fvf$i6m$1@news.uic.com...
>
> "mark carroll" <carroll@cis.ohio-state.edu>
> <<  Call me naive, but couldn't you just divide one number by the other
and
> see how close to 1 the result is?  Should work well whether x, y are
around
> 10^-20 or 10^20!  >>
>
>     Good point; I forgot to mention this.
>     First, the division will cost 70 cycles right away, so it's an
expensive
> solution.  However, I also solicited mathematical solutions where
> performance was not an issue, and 70 cycles won't kill me in a lot of
> situations.
>     A second problem is that division itself is subject to precision loss,
> so I could never really assert that two numbers were accurate to, say, 20
> mantissa bits.  One could well dismiss this requirement as somewhat
> artificial, though.
>     So basically, this approach would use the following expression:
>
>         (fabs(x/y - 1) < epsilon)
>
>     Hmmm.  This would definitely be an improvement.  I'll have to cobble
up
> some tests to see how well it performs to epsilon synthesis.

Don't do the division.  Use something like fabs(x-y) < epsilon*fabs(y).
Numerical Recipes has a number of algorithms that deal with the
"epsilon" issue.  For example, see section 10.1 on Golden Section
Search in One Dimension.

--
Dave Eberly
eberly@magic-software.com
http://www.magic-software.com
--------------------------------------------------------------------------------------
From: Nick Maclaren <nmm1@cus.cam.ac.uk>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 4:32 AM

In article <8e8h4r$tth$1@slb3.atl.mindspring.net>,
Dave Eberly <eberly@magic-software.com> wrote:
>
>Don't do the division.  Use something like fabs(x-y) < epsilon*fabs(y).
>Numerical Recipes has a number of algorithms that deal with the
>"epsilon" issue.  For example, see section 10.1 on Golden Section
>Search in One Dimension.

Given the gross unreliability of that book, I suggest not bothering.
Acton is a reliable author, as someone else mentioned.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
--------------------------------------------------------------------------------------
From: Dave Eberly <eberly@magic-software.com>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 7:42 AM

Nick Maclaren <nmm1@cus.cam.ac.uk> wrote in message
news:8e8tun$c8o$1@pegasus.csx.cam.ac.uk...
>
> Given the gross unreliability of that book, I suggest not bothering.
> Acton is a reliable author, as someone else mentioned.

Although *some* of the source code in the book has a reputation
for unreliability, much of it works well.  However, the textual
descriptions are quite good (what I was referring to in my last
post).  Every book has errors.  That should not stop you from
reading and deciding for yourself what is accurate.

--
Dave Eberly
eberly@magic-software.com
http://www.magic-software.com
--------------------------------------------------------------------------------------
From: Nick Maclaren <nmm1@cus.cam.ac.uk>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 8:00 AM


In article <8e99kq$i3f$1@slb0.atl.mindspring.net>, "Dave Eberly" <eberly@magic-software.com> writes:
|> Nick Maclaren <nmm1@cus.cam.ac.uk> wrote in message
|> news:8e8tun$c8o$1@pegasus.csx.cam.ac.uk...
|> >
|> > Given the gross unreliability of that book, I suggest not bothering.
|> > Acton is a reliable author, as someone else mentioned.
|> 
|> Although *some* of the source code in the book has a reputation
|> for unreliability, much of it works well.  However, the textual
|> descriptions are quite good (what I was referring to in my last
|> post).  Every book has errors.  That should not stop you from
|> reading and deciding for yourself what is accurate.

Actually, in the sections that I read, the textual descriptions
were every bit as dangerously misleading as the code.  Perhaps
half of it is reliable - which is the problem - because the other
half varies from the unreliable to the catastrophic.

If anyone is capable of judging which bits of it are reliable
and which aren't, they aren't going to get anything useful from
it, anyway.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
--------------------------------------------------------------------------------------
From: bubba <bubba@aol.com>
Subject: Re: Floating-Point Epsilon Comparison (70 cycles!)
Date: Thursday, April 27, 2000 9:36 PM

70 cycles not! Unless you still use an Inlet 486 processor. The famous
Intel Pentium fdiv bug arose from a botched algorithm implementation that
handles two bits per cycle. More modern algorithms retire up to 12 bits
per cycle, if I remember correctly.


"Steve Hollasch" <stevehol@microsoft.com> wrote in message
news:8e7fvf$i6m$1@news.uic.com...
>
> "mark carroll" <carroll@cis.ohio-state.edu>
> <<  Call me naive, but couldn't you just divide one number by the other
and
> see how close to 1 the result is?  Should work well whether x, y are
around
> 10^-20 or 10^20!  >>
>
>     Good point; I forgot to mention this.
>     First, the division will cost 70 cycles right away, so it's an
expensive
> solution.  However, I also solicited mathematical solutions where
> performance was not an issue, and 70 cycles won't kill me in a lot of
> situations.
>     A second problem is that division itself is subject to precision loss,
> so I could never really assert that two numbers were accurate to, say, 20
> mantissa bits.  One could well dismiss this requirement as somewhat
> artificial, though.
>     So basically, this approach would use the following expression:
>
>         (fabs(x/y - 1) < epsilon)
>
>     Hmmm.  This would definitely be an improvement.  I'll have to cobble
up
> some tests to see how well it performs to epsilon synthesis.
--------------------------------------------------------------------------------------
From: Andrew Bromage <bromage@goaway.cc.monash.edu.au>
Subject: Re: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 8:49 PM

G'day all.

"Steve Hollasch" <stevehol@microsoft.com> writes:

>====  The Problem With Typical Floating-Point Comparison Code

>I've been thinking about floating-point equivalence techniques lately and
>the way that floating-point variance is typically handled.
>    Usually, I'll see code like this:

>        if (fabs(x-y) < 0.000001) then x and y are considered equal.

>    I believe that this approach is fundamentally flawed for most uses;

You're probably right, if by "most uses" you mean "most instances of
its use in code".  However, it's perfectly acceptable if you know
something about x and y.  For example:

	- If x and y already been sanity checked to fall with in some
	  appropriate range, it could be fine.

	- If they measure something "real world", the tolerance might
	  be a real measurement.  For example, suppose it's a computer
	  graphics application, and x and y are raster coordinates,
	  then you might say that "anything less than 1/64th of a
	  pixel in size is insignificant for the purposes of this
	  comparison".

>    Whenever I see this sort of operation performed, it's always of the form
>listed above, and is typically laid down in layers, where the programmer is
>both lazy and confused, and tweaks the epsilon value ad hoc until some
>specific test passes. This code usually fails for some subsequent condition
>(when the two operands lie in some other region on the real line), at which
>point folks typically throw up their hands and say "well, that's floating
>point!"

If you hired a numeric analyst who writes code like this, sack the
individual immediately. :-)

>(If you're unfamiliar with IEEE floating-point representation, see
><http://research.microsoft.com/~hollasch/cgindex/coding/ieeefloat.html> for
>a quick overview of how floating-point values are encoded.)
>    One way to attack this problem is to do significant-bit truncation.  To
>do this, just treat the floating-point value as an integer, clear the low
>order bits (some arbitrary number specified by the programmer) and perform a
>floating-point compare.
[...]
>Another way to attack this problem is to synthesize a floating epsilon (as
>compared to anchored) value that yields the desired behavior.  This way we
>can get an epsilon that floats along with the two operands, rather than
>being fettered lamely to zero.  In this approach, we're given the two
>operands x & y, and a value q, which is the number of mantissa bits to
>discard.

You can do a merge of the two without resorting to surgery on the IEEE
representation by using the all important quantity called EPSILON.  In
C you will find it in <float.h> and it will be called FLT_EPSILON or
DBL_EPSILON depending on whether you're using floats or doubles.  EPSILON
is defined as:

	EPSILON is the smallest number such that 1.0+EPSILON != 1.0.

It can also be defined as:

	EPSILON is the smallest relative spacing between two
	consecutive floating point numbers.  That is, EPSILON is the
	smallest value of fabs((x-y)/x) where x != y.

<pedantic>
This isn't quite true you are also using denormalised numbers, but
we'll ignore that for now.
</pedantic>

The largest relative spacing between two consecutive numbers is also
easy to obtain: It's simply RADIX*EPSILON.  The reason why is left as
an exercise to the reader.

So to compare two normalised numbers by ignoring the rightmost N bits
of precision (approximately), simply use:

	if (fabs((x-y)/x) < EPSILON * (1<<(N+1))) { }

Of course this won't work for x==0.0.  If x==0.0, use y in the
denominator, unless y==0.0 too, in which case they are trivially
equal. :-)

> [...]it's relatively fast.

Surely we gave up "relatively fast" when we wanted numeric accuracy? :-)

Cheers,
Andrew Bromage
--------------------------------------------------------------------------------------
From: Nick Maclaren <nmm1@cus.cam.ac.uk>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 4:31 AM

In article <8e82r3$fp0@goaway.cc.monash.edu.au>,
Andrew Bromage <bromage@goaway.cc.monash.edu.au> wrote:
>"Steve Hollasch" <stevehol@microsoft.com> writes:
>
>>I've been thinking about floating-point equivalence techniques lately and
>>the way that floating-point variance is typically handled.
>>    Usually, I'll see code like this:
>
>>        if (fabs(x-y) < 0.000001) then x and y are considered equal.
>
>>    I believe that this approach is fundamentally flawed for most uses;
>
>If you hired a numeric analyst who writes code like this, sack the
>individual immediately. :-)

Only if you want to get sued :-)

There are many cases where it is the right test, and the relative
test is the wrong one!  For example, take simple matrix decomposition
(Cholesky, Gaussian, LU etc.)

Many of the 'small number' limits are of the form A*N*eps, where A
is the largest element of the matrix, N its size and eps is machine
epsilon.  Differences below that are likely to be due to rounding
error, and the simple code should use the absolute form of test
with that as the limit value.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email:  nmm1@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
--------------------------------------------------------------------------------------
From: Andrew Bromage <bromage@goaway.cc.monash.edu.au>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 9:15 PM

G'day all.

Steve Hollasch <stevehol@microsoft.com> writes:

>>>        if (fabs(x-y) < 0.000001) then x and y are considered equal.

I said:

>>If you hired a numeric analyst who writes code like this, sack the
>>individual immediately. :-)

nmm1@cus.cam.ac.uk (Nick Maclaren) writes:

>Only if you want to get sued :-)

Any reasonable coding standard will have a clause which says "no magic
numbers", so I still feel justified in reprimanding a coder for writing
that fragment. :-)

>For example, take simple matrix decomposition
>(Cholesky, Gaussian, LU etc.)

There are also the cases (as we have mentioned) where you are dealing
with physically significant numbers, such as device coordinates in
computer graphics.  If x and y have already been sanity checked to
ensure that the numbers make some sort of sense in the domain in
question, using a fixed tolerance can be appropriate.

Cheers,
Andrew Bromage
--------------------------------------------------------------------------------------
From: Don Taylor <dont@agora.rdrop.com>
Subject: Re: Floating-Point Epsilon Comparison
Date: Wednesday, April 26, 2000 11:37 PM

In comp.programming Steve Hollasch <stevehol@microsoft.com> wrote:
> ====  The Problem With Typical Floating-Point Comparison Code

> I've been thinking about floating-point equivalence techniques lately and
> the way that floating-point variance is typically handled.
>     Usually, I'll see code like this:

>         if (fabs(x-y) < 0.000001) then x and y are considered equal.
<snip>

For those of you interested in floating point programming and accuracy
Barnes & Noble had Forman Acton's "Real Computing Made Real, Preventing
Errors in Scientific and Engineering Calculations" for $9.95 in their
bargain section.  They also had a variety of other rather serious math
texts but some of them were "mis-shelved" and in odd sections.  I think
this is a terrific book for introducing people to quality mathematical
programming.  (And there are discount codes around that provide even
greater discounts if you order a certain quantity of books)

Just to make it really clear, I don't make a dime off this, I'm just
pointing folks towards a bargain book if you do floating point programming.

I'm always interested in finding good rules for avoiding errors in
mathematical programming.  Please let me know if you find some of these.

thank you


-----= Posted via Newsfeeds.Com, Uncensored Usenet News =-----
http://www.newsfeeds.com - The #1 Newsgroup Service in the World!
-----==  Over 80,000 Newsgroups - 16 Different Servers! =-----
--------------------------------------------------------------------------------------
From: Daniel Neville <bmange@airdmhor.gen.nz>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 5:19 AM

Steve Hollasch wrote:
> 
> ====  Solution One:  Mantissa Truncation
[...]
>     The downside to this approach is that it doesn't do well at the
> boundaries between two exponents,

How about:

const EpsilonMask = 0xFFFFFF00;
const HalfEpsilon = 0x00000080;

// Assume A and B are considered the same value.
if (AAsBits xor BAsBits) and EpsilonMask) <> 0 then begin
  if (((AAsBits + HalfEpsilon) xor (BAsBits + HalfEpsilon)) and
  EpsilonMask) <> 0 then begin
    // Regard A and B as different values.
  end;
end;

?

I haven't thought about the possibility of denormalised numbers or NANs,
though.

Cheers.


-- 
   Blancmange
   bmange@airdmhor.gen.nz
   http://www.airdmhor.gen.nz/blancmange/index.html
--------------------------------------------------------------------------------------
From: Steve Hollasch <stevehol@microsoft.com>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 1:39 PM


[In my original article, I pointed out that mantissa truncation has the
undesirable effect that numbers at the boundary will compare as unequal if
they straddle the boundaries between two exponents.]

Daniel Neville proposes the following alternative algorithm:

        const EpsilonMask = 0xFFFFFF00;
        const HalfEpsilon = 0x00000080;

        // Assume A and B are considered the same value.
        if (AAsBits xor BAsBits) and EpsilonMask) <> 0 then begin
              if (((AAsBits + HalfEpsilon) xor (BAsBits + HalfEpsilon)) and
  EpsilonMask) <> 0 then begin
              // Regard A and B as different values.
              end;
        end;



    The problem with this solution is that it will still judge the two
values as unequal if they straddle an exponent boundary and differ by more
than HalfEpsilon, since this HalfEpsilon shift will just push the larger
value into the next exponent range.  That is, it will only catch about half
the cases where the two values differ in range by the exponent-adjusted
epsilon.
--------------------------------------------------------------------------------------
From: Sean Montgomery <sean@iware.com>
Subject: Re: Floating-Point Epsilon Comparison
Date: Thursday, April 27, 2000 11:22 AM

FWIW, the March 2000 issue of C++ Report had an article on just this kinda
stuff...
--------------------------------------------------------------------------------------


