Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
494 views
in Technique[技术] by (71.8m points)

floating point - double-double implementation resilient to FPU rounding mode

Context: double-double arithmetic

“Double-double” is a representation of numbers as the sum of two double-precision numbers without overlap in the significands. This representation takes advantage of existing double-precision hardware implementations for “near quadruple-precision” computations.

One typical low-level C function in a double-double implementation may take two double-precision numbers a and b with |a| ≥ |b| and compute the double-double number (s, e) that represents their sum:

s = a + b;
e = b - (s - a);

(Adapted from this article.)

These implementations typically assume round-to-nearest-even mode.

In the above computation, (s, e) is a normalized double-double only because of this assumption. Without it, with a == 0x1.0p60, b == 1, in round-upward mode, s is computed as 0x1.0000000000001p60 and e a bit above -0x0.0000000000001p60. Their sum is equal to the mathematical sum of a and b but their significands overlap.

Take a == 0x1.0p120 and the mathematical sums of a and b on the one hand and s and e on the other hand do not even coincide any more.

Question

Is there a way to build a double-double-like library with the same properties that a typical double-double library has in round-to-nearest-even (that is, relatively fast and relatively accurate), but that works whatever the rounding mode happens to be?

Does such a library already exist?

More general context: correctly rounded elementary functions

Implementations of the double-double sort are used for intermediate computations in the implementation of libraries of correctly rounded elementary functions. As a result, libraries implemented this way tend to fail spectacularly when a function is called while the FPU is not in round-to-nearest-even mode. Changing the rounding mode inside the function is not very palatable, for performance reasons and because a signal arriving while the function is executing would leave the FPU in round-to-nearest-even mode. The simplest way I see to have fast, correctly rounded elementary functions that work in any rounding mode would be if one could somehow rely on a double-double-kind of arithmetic that worked in any rounding mode.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The article referred to by njuffa offers the function below, with very similar notations to those of my question, except that what is denoted fl (a+b) there is simply denoted a+b in my question:

Two?Sum?toward?zero2 (a, b)

if (|a| < |b|)
  swap (a , b)
s = fl (a + b)
d = fl (s ? a)
e = fl (b ? d)
if(|2 ? b|<|d|)
  s = a, e = b
return (s, e)

This is a very neat fix for this particular elementary computation when in round-toward-zero mode. It makes one hope that something would be possible for implementing a correctly rounded elementary function, at the very least by testing the rounding mode early and picking separate algorithms, or perhaps by writing very careful code that works for all rounding modes.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...