views:

78

answers:

2

I'm trying to implement an octree, and for that, I need a fast AABB-ray intersection algorithm. After some searching, I came across this paper that seemed to offer that. From the source code, available here, I translated the pluecker_cls_cff function to C# as this:

public bool Intersect_2(ref RayPluecker r)
{
  switch (r.Classification)
  {

    // 7 same-ish cases snipped

    case Classification.PPP:

      return !((r.Position.X > this.Max.X) || (r.Position.Y > this.Max.Y) || (r.Position.Z > this.Max.Z) ||
        (r.PlueckerCoefficient.X + r.Direction.X * this.Max.Y - r.Direction.Y * this.Min.X < 0) ||
        (r.PlueckerCoefficient.X + r.Direction.X * this.Min.Y - r.Direction.Y * this.Max.X > 0) ||
        (r.PlueckerCoefficient.Y + r.Direction.X * this.Min.Z - r.Direction.Z * this.Max.X > 0) ||
        (r.PlueckerCoefficient.Y + r.Direction.X * this.Max.Z - r.Direction.Z * this.Min.X < 0) ||
        (r.PlueckerCoefficient.Z - r.Direction.Z * this.Min.Y + r.Direction.Y * this.Max.Z < 0) ||
        (r.PlueckerCoefficient.Z - r.Direction.Z * this.Max.Y + r.Direction.Y * this.Min.Z > 0));
  }

  return false;
}

This seems to work fine, but it seemed fairly slow to me (250ms to do 10 million intersects) so I tried some micro-benchmarking with different varieties. In one, I removed the negation that is right after the return statement and reversed all comparisons (> to < and visa versa).

It's now:

case Classification.PPP:

      return ((r.Position.X < this.Max.X) || (r.Position.Y < this.Max.Y) || (r.Position.Z < this.Max.Z) ||
        (r.PlueckerCoefficient.X + r.Direction.X * this.Max.Y - r.Direction.Y * this.Min.X > 0) ||
        (r.PlueckerCoefficient.X + r.Direction.X * this.Min.Y - r.Direction.Y * this.Max.X < 0) ||
        (r.PlueckerCoefficient.Y + r.Direction.X * this.Min.Z - r.Direction.Z * this.Max.X < 0) ||
        (r.PlueckerCoefficient.Y + r.Direction.X * this.Max.Z - r.Direction.Z * this.Min.X > 0) ||
        (r.PlueckerCoefficient.Z - r.Direction.Z * this.Min.Y + r.Direction.Y * this.Max.Z > 0) ||
        (r.PlueckerCoefficient.Z - r.Direction.Z * this.Max.Y + r.Direction.Y * this.Min.Z < 0));

This should give the same result, right? It seemed so, as it returns the same results as the negated version with a couple of test cases. However, in the benchmark, it was 5x faster (50ms to do 10 million intersects)! I'm sure it wasn't being optimized out, my benchmark looks like this:

for (int i = 0; i < 10000000; i++)
{
  if (!box.Intersect_3(ref ray))
  {
    throw new Exception();
  }
}

What can explain this huge difference? I'm running .NET 4.0 on x86.

+5  A: 

Your second code doesn't do the same thing as your first.

In addition to the changes you already made, you need to turn all your ORs into ANDs. (See De Morgan's Laws.)

I'll bet that after you make the fix, your two versions will run at the same speed.

Jon Rodriguez
++ Elegant observation.
Mike Dunlavey
That, and also the negation of `>` is `<=`, not `<`.
Timwi
Yup, you're right. Thanks! And then you thought you knew boolean logic.. It was just too good to be true, I guess, because the algorithm doesn't seem that fast (40m intersects/s on a single core of an Core 2 Duo E8400) but that might just be perception.
JulianR
I guess I'll try some of the other variations, that don't have as many pre-computed fields stored in the Ray, the paper is 7 years old already and the cost of a bigger object in memory with respect to cache-misses might not offset the small gain by pre-computing fields these days as the gap between memory and cpu speed widened.
JulianR
A: 

Specifically related to performance, I bet the return statement is short-circuiting sooner in the second case than the first. It may be worth trying to change the order of the comparisons if some are more likely than others to be true. If you change the calculations to be equivalent using && instead of || in the second case, then you would want the ones that are most likely to be false to come first.

Alex Morris