I'm computing the ordinate y of a point on a line at a given abscissa x. The line is defined by the coordinates of its two end points, (x0,y0) and (x1,y1). The end point coordinates are floats, and the computation must be done in float precision for use on the GPU.

The maths, and thus the naive implementation, are trivial.

Let t = (x - x0)/(x1 - x0), then y = (1 - t) * y0 + t * y1 = y0 + t * (y1 - y0).

The problem is when x1 - x0 is small. The subtraction introduces a cancellation error, and when it combines with the cancellation error of x - x0 in the division, I expect a significant error in t.

The question is whether there exists another way to determine y with better accuracy.

i.e. should I compute (x - x0)*(y1 - y0) first, and divide by (x1 - x0) afterwards?

The difference y1 - y0 will always be big.

+1  A: 

If you have the possibility to do it, you can introduce two cases in your computation, depending on whether abs(x1-x0) < abs(y1-y0). In the near-vertical case, where abs(x1-x0) < abs(y1-y0), compute x from y instead of y from x.

EDIT. Another possibility would be to obtain the result bit by bit using a variant of dichotomic search. This will be slower, but may improve the result in extreme cases.

// Input is x; assumes the segment covers x
// Sort the end points by x, keeping each y paired with its own x
// (pairing min(y0,y1) with min(x0,x1) would be wrong for a decreasing line)
if (x0 <= x1) { xmin = x0; ylo = y0; xmax = x1; yhi = y1; }
else          { xmin = x1; ylo = y1; xmax = x0; yhi = y0; }
for (int i = 0; i < 20; i++) // get 20 bits in the result
{
  xmid = (xmin + xmax) * 0.5f;
  ymid = (ylo + yhi) * 0.5f;
  if ( x < xmid ) { xmax = xmid; yhi = ymid; } // first half
  else { xmin = xmid; ylo = ymid; } // second half
}
// Output is some value between ylo and yhi
Y = ylo;
Eric Bainville
This is unfortunately not possible. I have to find y for a given x value.
chmike
And it's not even possible if the line is parallel to the y axis. See my post.
Magnus Skog
This is the method described in the Cohen-Sutherland clipping algorithm. It is smart but as you say not very efficient. I thought of converting it to integer computation so we know how many accurate bits we have.
chmike
A: 

If your source data is already a float then you already have fundamental inaccuracy.

To explain further, imagine you were doing this graphically. You have a 2D sheet of graph paper, and 2 points marked.

Case 1: Those points are very accurate, and have been marked with a very sharp pencil. It's easy to draw the line joining them, and easy to then get y given x (or vice versa).

Case 2: These points have been marked with a big fat felt tip pen, like a bingo marker. Clearly the line you draw will be less accurate. Do you go through the centre of the spots? The top edge? The bottom edge? Top of one, bottom of the other? Clearly there are many different options. If the two dots are close to each other then the variation will be even greater.

Floats have a certain level of inaccuracy inherent in them, due to the way they represent numbers, ergo they correspond more to case 2 than case 1 (which one could suggest is the equivalent of using an arbitrary precision library). No algorithm in the world can compensate for that. Imprecise data in, imprecise data out.

Visage
You are right about the float accuracy but wrong on your last sentence. See http://docs.sun.com/source/806-3568/ncg_goldberg.html.
chmike
Oops. After reading again: you address the problem of representation precision, where I address the problem of precision in computation. When computing the mean or variance, the Knuth algorithm provides better accuracy than the naive implementation, even though both are mathematically correct.
chmike
There are times when this fundamental inaccuracy has more effect. For example, consider the difference in the result of 5/(5-x) when x is 4.002 vs 4.001 vs 4.0, and when x is 5.002 vs 5.001 vs 5.0. Closer to 5, the inaccuracy has a greater effect. Here, "The problem is when x1 - x0 is small."
Liran Orevi
A: 

Check if the distance between x0 and x1 is small, i.e. fabs(x1 - x0) < eps. Then the line is parallel to the y axis of the coordinate system, i.e. you can't calculate the y values of that line depending on x. You have infinitely many y values and therefore you have to treat this case differently.

Magnus Skog
You may assume that the condition x0 < x < x1 has been previously checked. This ensures the line crosses the abscissa x. Other use cases are handled differently and without an equivalent precision difficulty.
chmike
I don't think you understand what I mean. "x0 < x < x1" might still be valid if the line is parallel to the y axis. Even if they differ by 1e-8 that inequality is still true, when in fact you should treat x0 and x1 as identical. Hence, you will not be able to deduce y as a function of x. Otherwise you wouldn't have these cancellation problems in the first place.
Magnus Skog
See the question: dy is always big, so the line will never be parallel to the x axis. In fact, dx <= dy.
chmike
I never said it would be either :) I said parallel to the y axis.
Magnus Skog
+3  A: 

To a large degree, your underlying problem is fundamental. When (x1-x0) is small, it means there are only a few bits in the mantissa of x1 and x0 which differ. And by extension, there are only a limited number of floats between x0 and x1. E.g. if only the lower 4 bits of the mantissa differ, there are at most 14 values between them.

In your best algorithm, the t term represents these lower bits. And to continue our example, if x0 and x1 differ by 4 bits, then t can take on only 16 values. The calculation of these possible values is fairly robust. Whether you're calculating 3E0/14E0 or 3E-12/14E-12, the result is going to be close to the mathematical value of 3/14.

Your formula has the additional advantage of guaranteeing y0 <= y <= y1, since 0 <= t <= 1.

(I'm assuming that you know enough about float representations, and therefore "(x1-x0) is small" really means "small, relative to the values of x1 and x0 themselves". A difference of 1E-1 is small when x0=1E3 but large if x0=1E-6 )

MSalters
I fully agree with your analysis. x0 and x1 will differ only by a few less significant bits and x will be one value in this small float-representable range. What conclusion can we draw from this? That there is no way to get better result precision than by using the straightforward math implementation? This would explain the observed results.
chmike
Yup, that's my conclusion. The values that you find for y would not significantly improve if you did the intermediate calculations on doubles - you'd hit the same limits when rounding the result back to float.
MSalters
A: 

How about computing something like:

t = sign * power2( sqrt(abs(x - x0)) / sqrt(abs(x1 - x0)) )

The idea is to use a mathematically equivalent formula in which a small (x1 - x0) has less effect. (Not sure if the one I wrote matches this criterion.)

Liran Orevi
I added this expression to the test code and it gave the same result as the other methods in average error and stdDev. So no benefit. Besides, it doesn't seem very efficient, unless the compilers are smart enough to remove the sqrt.
chmike
Thanks. Regarding the average error, did you check the error divided by the actual value? There is a difference between 5.1 5.0 and 0.0 0.1; the latter is a bigger error in proportion to the value.
Liran Orevi
How does this improve matters?
peterchen
I would be curious to try. Could you tell me what I should compute? I have the difference; by what should I divide it? The float value, the double precision value, a difference?
chmike
+1  A: 

I have implemented a benchmark program to compare the effect of the different expression.

I computed y using double precision and then compute y using single precision with different expressions.

Here are the expressions tested:

inline double getYDbl( double x, double x0, double y0, double x1, double y1 )
{
    double const t = (x - x0)/(x1 - x0);
    return y0 + t*(y1 - y0);
} 

inline float getYFlt1( float x, float x0, float y0, float x1, float y1 )
{
    double const t = (x - x0)/(x1 - x0);
    return y0 + t*(y1 - y0);
} 

inline float getYFlt2( float x, float x0, float y0, float x1, float y1 )
{
    double const t = (x - x0)*(y1 - y0);
    return y0 + t/(x1 - x0);
} 

inline float getYFlt3( float x, float x0, float y0, float x1, float y1 )
{
    double const t = (y1 - y0)/(x1 - x0);
    return y0 + t*(x - x0);
} 

inline float getYFlt4( float x, float x0, float y0, float x1, float y1 )
{
    double const t = (x1 - x0)/(y1 - y0);
    return y0 + (x - x0)/t;
}

I computed the average and stdDev of the difference between the double precision result and single precision result.

The result is that there is no difference in the average over 1000 and 10K random value sets. I used the icc compiler with and without optimization, as well as g++.

Note that I had to use the isnan() function to filter out bogus values. I suspect these result from underflow in the difference or division.

I don't know if the compilers rearrange the expression.

Anyway, the conclusion from this test is that the above rearrangements of the expression have no effect on the computation precision. The error remains the same (on average).

chmike
For a better comparison, take all inputs and return values as floats; up/downcast to double as appropriate. Your question is whether the limited precision is due to the calculation with floats. I and others contend it's the limited accuracy of the input itself, not of the calculation, that you are seeing.
MSalters
Have you tried using values in which (x1 - x0) is small?
Liran Orevi
Additionally, which optimization settings have you used? The optimizer might reorder expression evaluation order.
peterchen
Sorry, can't keep track of all the comments. Oops, looking at the code I copied here I see that I declare all t as double precision values, which is a mistake. It should be all float or all double.
chmike
After changing all double const t to float const t, there is no difference when computing the average over 10K values. For icc the optimization setting is icc -xS -O3. For g++ I tried with none. No difference.
chmike
+2  A: 

You may have a look at the sources of Qt's QLine (if I remember it right); they have implemented an intersection determination algorithm taken from one of the "Graphics Gems" books (the reference must be in the code comments; the book was on eDonkey a couple of years ago). That algorithm, in turn, has some guarantees on applicability for a given screen resolution when calculations are performed with a given bit width (they use fixed-point arithmetic, if I'm not wrong).

mlvljr
A: 

As MSalters said, the problem is already in the original data.

Interpolation / extrapolation requires the slope, which already has low accuracy in the given conditions (worst for very short line segments far away from the origin).

Choice of algorithm cannot regain this accuracy loss. My gut feeling is that the different evaluation order will not change things, as the error is introduced by the subtractions, not the division.


Idea:
If you have more accurate data when the lines are generated, you can change the representation from ((x0, y0), (x1, y1)) to (x0, y0, angle, length). You could store angle or slope; slope has a pole, but angle requires trig functions... ugly.

Of course that won't work if you need the end point frequently. And if you have so many lines that you can't store additional data, I have no idea. But maybe there is another representation that works well for your needs.

Doubles have enough resolution in most situations, but that would double the working set too.

peterchen
I agree with your point on the input data precision. But the question was addressing the computation process. Would the described computation method further spoil the reduced precision we have?
chmike