+9  A: 

Keep track of nearby balls -

Just like the insertion sort is optimal due to the minimal change per frame, you can use the same property to keep track ball 'neighbors'

Each ball can only interact with a possible 6 other balls at the most. You can probably run an algorithm every 10 frames or so that figures out which neighbors each ball has, and then use that information for the next ten frames to calculate the actual collisions.

You can also follow the vectors of each ball, draw virtual lines, and see which lines cross in the next 3-5 frames and only calculate collisions that could possibly happen (even though few will happen due to time).

Just as you sorted by the x axis you can 'group' balls in subdivisions inside the main window. When a ball is inside a subdivision by at least one ball diameter, it only need to look at the balls in the same subdivision. If it's closer to a border or two you need to look at one or two other subdivisions, but it should be fewer calculations.

Further, those subdivisions can be dynamically located so the average subdivision has only 100 balls - there's not need to have subdivisions of the same size with different numbers of balls.

Depending on how crowded a subdivision is (balls per square unit) you might try different algorithms - a sparsely filled box only needs vector calculations to predict collisions, whereas a densely packed subdivision might only need some sort of optimized hexagraphic calculation. Sparse boxes may not need collision detection for many frames (since overlap won't be noticed as much as in densely populated boxes).

The energy density of a given box can be determined - a very sparse box with low energy balls needs fewer collision calculations than a sparse box with high energy.

Adam Davis
A: 

I think its time to measure performance to verify where the bottleneck really is. You didn't need to do measurement earlier because there was an obvious algorithm problem. Now there is still room to improve the algorithm, but are you sure thats the biggest problem? Measure how many comparisons you do per ball now. Is it something small? If so then algorithmic changes may not be the best next step.

Compare how long it takes to determine every balls position vs. the time it takes to actually draw them. It could be the later is now the bottleneck, and you should focus your efforts on the rendering code rather than the physics engine.

Frank Schwieterman
I've profiled it, I should of included this in my question. Collision Detection was about 55% of my time, rendering was about 45%. Aproximately.
Simucal
+4  A: 

The hard core physics engines use vectorization of floats, which gives a x16 boost on current hardware if lucky, and way more on specialized hardware. Larrabee for example can handle 1024 simultaneous calculations for a theretocal x1024 boost in math processing (but it needs this because it's also GPU)

Haven't looked at the code yet, but are you using math optimizations like fast square root and binary absolutes, or bitwise sign flipping? These things alone don't help, but when your churning large amounts of data, these techniques reroute some math to the main CPU, freeing the FPU, basically giving you a larger throughput.

Also GCC's SIMD code generation litteraly sucks, I've seen upto a 16 fold increase by using VC or IntelCompiler, that means, if you've paid attention, that GCC wasn't using any SIMD instructions at all!

Also those allegded 10k collisions aren't in such close quarters as you sim, so it's not directly comparable.

Robert Gould
The latest GCC appears to have a -ftree-vectorize option (implied by -O3) and a related -ftree-vectorizer-verbose=n option to provide more information when it doesn't work. Or you could just explictly invoke SIMD features with the compiler built-ins (first google hit for "gcc simd")
Wim Coenen
A: 

Maybe the problem is that there are so many interactions as the balls "pile up"? If I were doing this, I'd try the following:

  • turn gravity to 0 so that many collisions are not occuring simultaneously
  • profile your collission detection algorithm - if you are using a sorted array of balls and only analyzing the 6 or 7 that are nearest, then you shouldn't have any problems ... that's only 8000 or so collission checks per cycle assuming 800 balls, which isn't very many
Jess
+2  A: 

It's starting to sound almost like a slightly compressible fluid in a gravitational field at this point. Computational fluid dynamics ideas, using an Eulerian viewpoint, might help. If you create a grid with a scale such that only one ball at a time can occupy each cell, you can write conservation of mass, momentum, and energy balances for each cell and track the motion of the balls that way.

duffymo
+10  A: 

The simple way is don't test Object vs Object collisions, populate an array with the center point of each ball. Then, from each center scan a a square of size 4*radius centered on that point (you can optimize this a bit by not using a square, at the expense of making the code more complex). If there's another center in this square, only then do you check to see if they're within 2*radius of each other (and therefore colliding). You can further optimize this by reducing the resolution, and rounding the ball position, reducing the number of checks you need to do.

Another way, is to divide your space into a grid, and store the objects in grid regions. You only need to check for collisions between objects in adjacent grids.

patros
There's a pretty neat explanation at http://www.metanetsoftware.com/technique/tutorialB.html.
Nikhil Chelliah
+3  A: 

Once a ball is completely surrounded by other balls stop considering it for collision detection. Just from looking at your screenshot it seems that only the "surface" balls should be considered. Why check the ball 6 balls deep that nothing can possibly collide with? This would greatly reduce the number of potential collisions.

+2  A: 

I've been following the development of Chipmunk since it began and from my understanding the answer to optimizing is in the data structures. (isn't it always?)...

The data structure Chipmunk uses to achieve this is a Spacial Hash.

Nick
A: 

I've done something very much like this on the iPhone, and it uses the accelerometer to allow you to tilt the balls around, and the touch screen to add and delete balls. It can handle at least 30 balls before it starts to bog down noticeably.

One optimization I did early on is to inline the math. Originally I had a separate "vector" class, and it could only handle 10-12 balls before it turned into a slide show. Profiling showed it was spending a lot of time allocating and deallocating vectors.

Also, I don't separate the balls once they hit, I just bounce the vectors. They don't seem to ever overlap, and when they do it's pretty obvious that it's because they all got jammed together on the bottom.

I'm not quite ready to release the code yet, I've got some polishing to do, then I'll put it on the store.

GoatRider
+1  A: 

Try this:

Divide your rectangle up into N*M squares, such that the squares are slightly wider than the radius of a ball. It might be a good idea to have the squares overlap the edges of your rectangle, rather than fit neatly within it.

Make an array of BitSet. Don't use an Bitset[M][N], just new Bitset[M*N] - a little multiplication isn't going to hurt you.

Identify each ball with a number. When you position a ball at a location, set the bit in the bitset for that square and the 8 squares around it (to make this easier, extend your array of squares so that they extend one beyond the edge of the rectangle - that way you don't have to clip.)

Run through the squares. For each pair of balls in each square mark that pair as being a potential collision. To do this, create a bitset and - given that you have H balls and balls A and B occupy the same square - set the bits A+B*H and A*H+B.

Looing through the potential collisions is now easy, because BitSet includes a method that says "find me the next bit after this one that is set". Remember that each bit is double counted, so when bit Q is detected as being set, be sure to unset bit (Q%H)*H + (Q/H) - which is the other bit of the pair.

Alternatively: you can collapse that array of collisions pretty easily. A collision between A and B - given that A > B can be marked by setting bit A * (A-1) / 2 + B. This has the advantage that you don't need to care about the total number of balls.

Actually: forget it. just use this class which I wrote as an exercise:

import java.util.BitSet;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class PairSet extends BitSet implements
    Iterable<PairSet.Pair> {
  public static class Pair implements Comparable<Pair> {
    public final int a;
    public final int b;

    private Pair(int a, int b) {
      if (a < 0 || b < 0 || a == b) { throw new IllegalArgumentException(
          "Pair(" + a + "," + b + ")"); }
      if (a > b) {
        this.a = a;
        this.b = b;
      } else {
        this.a = b;
        this.b = a;
      }
    }

    public String toString() {
      return "Pair(" + a + "," + b + ")";
    }

    public int hashCode() {
      return a * (a - 1) / 2 + b;
    }

    public boolean equals(Object o) {
      return o instanceof Pair
          && hashCode() == ((Pair) o).hashCode();
    }

    public int compareTo(Pair o) {
      return hashCode() - o.hashCode();
    }

  }

  PairSet() {}

  PairSet(BitSet z) {
    or(z);
  }

  PairSet(Iterable<Pair> z) {
    for (Pair p : z)
      set(p);
  }

  public void set(Pair p) {
    set(p.a, p.b);
  }

  public void clear(Pair p) {
    clear(p.a, p.b);
  }

  public void set(int a, int b) {
    if (a < 0 || b < 0 || a == b) { throw new IllegalArgumentException(
        "add(" + a + "," + b + ")"); }
    if (a > b) {
      set(a * (a - 1) / 2 + b);
    } else {
      set(b * (b - 1) / 2 + a);
    }
  }

  public void clear(int a, int b) {
    if (a < 0 || b < 0 || a == b) { throw new IllegalArgumentException(
        "add(" + a + "," + b + ")"); }
    if (a > b) {
      clear(a * (a - 1) / 2 + b);
    } else {
      clear(b * (b - 1) / 2 + a);
    }
  }

  public Iterator<Pair> iterator() {
    return new Iterator<Pair>() {
      int at       = -1;
      int triangle = 0;
      int a        = 0;

      public boolean hasNext() {
        return nextSetBit(at + 1) != -1;
      }

      public Pair next() {
        int nextat = nextSetBit(at + 1);
        if (nextat == -1) { throw new NoSuchElementException(); }
        at = nextat;
        while (triangle <= at) {
          triangle += a++;
        }
        return new Pair(a - 1, at - (triangle - a) - 1);

      }

      public void remove() {
        throw new UnsupportedOperationException();
      }
    };
  }
}

And that will nicely keep track of your potential collisions. The psudeocode is then

SW = width of rectangle
SH = height of rectangle
R = radius of balls + 1 // +1 is a fudge factor.

XS = number of squares across = SW/R + 4; // the +4 adds some slop
YS = number of squares hight = SH/R + 4; // the +4 adds some slop

int sx(Point2D.Float p) // the square into which you put a ball at x
   // never returns a number < 1
 := (int)((p.x-R/2)/R) + 2;

int sy(Point2D.Float p) // the square into which you put a ball at y
   // never returns a number < 1
 := (int)((p.y-R/2)/R) + 2;

Bitset[] buckets = new BitSet[XS*YS];
{for(int i: 0; i<buckets.length; i++) bukets[i] = new BitSet();}

BitSet bucket(int x, int y) {return bucket[y*XS + x]}
BitSet bucket(Point2D.Float p) {return bucket(sy(p),sx(p));}

void move(int ball, Point2D.Float from, Point2D.Float to) {
  if bucket(from) == bucket(to) return;
  int x,y;
  x = sx(from); y=sy(from);
  for(int xx==-1;xx<=1; xx++)
  for(int yy==-1;yy<=1; yy++)
  bucket(sx+xx, sy+yy).clear(ball);
  x = sx(to); y=sy(to);
  for(int xx==-1;xx<=1; xx++)
  for(int yy==-1;yy<=1; yy++)
  bucket(sx+xx, sy+yy).set(ball);
} 

PointSet findCollisions() {
    PointSet pp = new PointSet();
    for(BitSet bb: buckets) {
    int a;
    int prev_a;
    for(prev_a = -1; (a = bb.nextSetBit(prev_a+1))!=-1; prev_a=a) {
      int b;
      int prev_b;
      for(prev_b = a; (b = bb.nextSetBit(prev_b+1))!=-1; prev_b=b) {
        pp.add(a,b);
      }
    }
    return pp;
}
paulmurray
Actually - you could get rid of allocaing and deallocating "Pair" objects by getting an iterator that you instantate with a reference to a mutable Pair object. It's job is to fill in the x and y of the pair it points at.
paulmurray
+3  A: 

Your main problem is your collision resolution algorithm -- you can speed up the drawing and the collision detection, but that won't stop your balls collapsing into each other. The problem you have is much more difficult to solve cleanly than it appears; it is possible to do it right, though.

In your case, in the configuration that fails (big bin-o'-balls) your objects should really be sliding against each other, not bouncing. Sliding is a continuous constraint, which is different from the kind of impulse-based constraints your implementation handles.

The simplest thing you can do to prevent the collapse is to recheck all colliding objects to make sure they aren't interpenetrating after you do the collision handling -- potentially iterating as many times as necessary until none of them violate the constraints. This will of course take longer -- possibly much longer (and even raises the possibility of an infinite loop, though maybe not in that particular situation...).

There are good algorithms that will make your simulation behave, but they are more complicated than your impulse-based system. Generally, they involve considering mutually interacting bodies (like the balls in the bin) as a whole, and adjusting their collective configuration to satisfy their mutual constraints.

You should look for works (books, papers, web sites) about multibody simulation. I can't say which is likely to be most useful for your purposes; if I were you, I would start by visiting a good university library and looking through whatever books they have on the subject. You should be prepared for some serious math; if terms like "Lagrange multipliers" make you break out in hives, look out 8^). Basically, if you go this route, you will likely learn a lot of math, and no small amount of physics.

Yes, I agree with what you say. I've looked into how various popular physics engines handle it and it can get quite complex. After pouring over the continuous collision detection code in Box2D I could get a general understanding of what they were doing but I would need time to work through the math
Simucal
Thanks for your input, and a great first answer on SO! +1
Simucal
A: 

Unless I've missed something, the balls overlapping are the result of a bug, not inefficient code. Inefficient code would just cause the animation to run slow.

I'd suggest the problem is due to the iterative approach of the ball to ball collision detection. Currently it looks like you're only considering the result for a ball to ball collision. When multiple balls are touching a single ball the results seem to be undefined, and the chances of this happening increase with larger gravity or number of balls.

Pool
It isn't a bug. There are simply different approaches. One is where you used fixed increment time steps and during that time step you might of pushed beyond the point too balls overlap. If you want to use some more advanced calculus you can do things to calculate the exact moment a ball would collide and proceed from there.
Simucal
It still shouldn't be possible that faster execution of code changes the resulting effect, though. Or do you intend to reduce the fixed incremental time once your code is more optimal?
Pool