The only thing I found so far is "error-free transformations". For any floating-point numbers, the rounding errors of a+b, a-b, and a*b are themselves exactly representable as floating-point numbers (in round-to-nearest mode, assuming no overflow/underflow etc.).
Addition (and obviously subtraction) error is easy to compute: if abs(a) >= abs(b), the error is exactly b-((a+b)-a) (2 flops, or 4-5 if we don't know which operand is bigger). Multiplication error is trivial to compute with fma: it is simply fma(a,b,-a*b). Without fma it's 16 flops of rather nasty code, and a fully generic emulation of correctly rounded fma is even slower than that.
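A minimal sketch in C of those transformations (helper names are mine, not any standard API; the fma-free TwoProduct splitting is omitted since the text only uses the fma form):

```c
#include <math.h>

/* Fast two-sum: requires |a| >= |b|; afterwards s + err == a + b exactly. */
static void fast_two_sum(double a, double b, double *s, double *err) {
    *s = a + b;
    *err = b - (*s - a);           /* the 2-flop error term from the text */
}

/* Branch-free two-sum (Knuth): no ordering assumption on a, b. */
static void two_sum(double a, double b, double *s, double *err) {
    *s = a + b;
    double bb = *s - a;
    *err = (a - (*s - bb)) + (b - bb);
}

/* Two-product with fma: afterwards p + err == a * b exactly. */
static void two_prod_fma(double a, double b, double *p, double *err) {
    *p = a * b;
    *err = fma(a, b, -*p);
}
```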
An extra 16 flops of error tracking per flop of real computation is huge overkill, but with just 1-5 pipeline-friendly flops it's quite reasonable. For many algorithms built on these transformations, a 50%-200% overhead for error tracking and compensation gives results as accurate as if all calculations had been done in twice the working precision, avoiding ill-conditioning in many cases.
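As a minimal illustration of that compensation pattern (a sketch reusing the two_sum helper above, in the style of the Sum2 algorithm of Ogita, Rump and Oishi):

```c
/* Compensated summation: track the rounding error of every addition with
 * two_sum and fold the accumulated error back in once at the end. The
 * result is about as accurate as summing in doubled working precision. */
static double compensated_sum(const double *x, int n) {
    double s = 0.0, e = 0.0;
    for (int i = 0; i < n; i++) {
        double err;
        two_sum(s, x[i], &s, &err);  /* s + err == old_s + x[i] exactly */
        e += err;                    /* accumulate the lost low-order bits */
    }
    return s + e;
}
```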
Interestingly, fma is never used in these algorithms to compute results, only to find errors, because finding the error of an fma is as slow as finding the error of a multiplication was without fma.
Relevant keywords to search for are "compensated Horner scheme" and "compensated dot product", with the Horner scheme benefiting a lot more.
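For concreteness, here is a sketch of a compensated Horner evaluation built from the same helpers (a simplified take on the CompHorner algorithm of Graillat, Langlois and Louvet; names are mine):

```c
/* Compensated Horner scheme: evaluate p(x) = a[n]*x^n + ... + a[0] while
 * tracking the error of every multiplication and addition in a second
 * "error polynomial", then apply the correction once at the end.
 * Note that fma appears only inside two_prod_fma, to extract the error. */
static double compensated_horner(const double *a, int n, double x) {
    double r = a[n];   /* running Horner value */
    double e = 0.0;    /* running value of the error polynomial */
    for (int i = n - 1; i >= 0; i--) {
        double p, pi, s, sigma;
        two_prod_fma(r, x, &p, &pi);   /* p + pi == r*x exactly */
        two_sum(p, a[i], &s, &sigma);  /* s + sigma == p + a[i] exactly */
        r = s;
        e = e * x + (pi + sigma);      /* propagate errors through Horner */
    }
    return r + e;                      /* compensated result */
}
```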