I am building a CAD-file converter on top of two libraries (Opencascade and DWF Toolkit).

However, my question is platform-agnostic:

Given:

I have generated a mesh as a list of triangular faces from a model constructed through my application. Each triangle is defined by three vertices, each consisting of three floats (x, y and z coordinates). Since the triangles form a mesh, most of the vertices are shared by more than one triangle.

Goal:

I need to find the list of unique vertices, and to generate an array of faces, each consisting of a tuple of three indices into this list.

What I want to do is this:

//step 1: build a list of unique vertices
for each triangle
   for each vertex in triangle
      if not vertex in listOfVertices
         Add vertex to listOfVertices

//step 2: build a list of faces
for each triangle
   for each vertex in triangle
      Get vertex index from listOfVertices
      AddToMap(vertexIndex, triangle)

While I do have an implementation that does this, step 1 (the generation of the list of unique vertices) is really slow, on the order of O(n²), since each vertex is compared to all vertices already in the list. I thought, "Hey, let's build a map keyed on my vertices' components using std::map, that ought to speed things up!", only to find that generating a unique key from three floating-point values is not a trivial task.
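Concretely, my step 1 is equivalent to something like this in C++ (a sketch; Vertex and Triangle stand in for my actual types):

#include <vector>

struct Vertex { float x, y, z; };
struct Triangle { Vertex v[3]; };

// Naive step 1: the inner linear search makes this O(n^2) overall.
std::vector<Vertex> buildUniqueList(const std::vector<Triangle>& triangles)
{
   std::vector<Vertex> unique;
   for (const Triangle& t : triangles)
      for (const Vertex& v : t.v)
      {
         bool found = false;
         for (const Vertex& u : unique)
            if (u.x == v.x && u.y == v.y && u.z == v.z) { found = true; break; }
         if (!found)
            unique.push_back(v);
      }
   return unique;
}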

Here, the experts of stackoverflow come into play: I need some kind of hash function that works on three floats, or any other function that generates a unique value from a 3D vertex position.

+1  A: 

Three solutions. There are certainly others.

  1. Use a hash map. This will only work if "same" means exactly the same.
  2. Use a binary space partition (BSP) to divide up the points.
  3. Use a regular grid to divide up your points.

In cases 2 and 3, if you want to allow some tolerance, you will need to search more than one part of the tree or grid. In the BSP case this means checking whether you are within tolerance of the dividing plane, and if so recursing into both halves. In the grid case this means checking all neighbouring cells that are within tolerance. Neither of these is overly hard, but it does mean it will be harder to use an "off the shelf" solution.
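To make option 3 concrete, here is a minimal sketch in C++ (VertexGrid, findOrInsert and the chosen primes are illustrative, not from any particular library). Each vertex is bucketed by its quantized grid cell, and a lookup scans the 27 surrounding cells so that matches within tolerance of a cell boundary are still found:

#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Vertex { float x, y, z; };

struct VertexGrid
{
   float cell;                        // cell size; choose cell >= tolerance
   std::vector<Vertex> verts;         // unique vertices found so far
   std::unordered_map<std::int64_t, std::vector<int>> cells;

   std::int64_t key(long ix, long iy, long iz) const
   {
      // Collisions here are harmless: they only cause extra distance checks.
      return (std::int64_t)ix * 73856093 ^ (std::int64_t)iy * 19349663
           ^ (std::int64_t)iz * 83492791;
   }

   // Return the index of an existing vertex within tol of v, or insert v.
   int findOrInsert(const Vertex& v, float tol)
   {
      long ix = (long)std::floor(v.x / cell);
      long iy = (long)std::floor(v.y / cell);
      long iz = (long)std::floor(v.z / cell);
      for (long dx = -1; dx <= 1; ++dx)
      for (long dy = -1; dy <= 1; ++dy)
      for (long dz = -1; dz <= 1; ++dz)
      {
         auto it = cells.find(key(ix + dx, iy + dy, iz + dz));
         if (it == cells.end()) continue;
         for (int i : it->second)
         {
            float ex = verts[i].x - v.x;
            float ey = verts[i].y - v.y;
            float ez = verts[i].z - v.z;
            if (ex * ex + ey * ey + ez * ez <= tol * tol) return i;
         }
      }
      verts.push_back(v);
      cells[key(ix, iy, iz)].push_back((int)verts.size() - 1);
      return (int)verts.size() - 1;
   }
};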

Michael Anderson
It took me a bit to get how space partitioning could help me - This is an interesting approach. +1
sum1stolemyname
A: 

A common way to get a hash is to multiply the bit pattern of each float by a prime and add the results together. Something like this:

#include <cstring>

unsigned int hash_point(float x, float y, float z)
{
   // Any three large primes will do; tune to taste.
   const unsigned int PRIME1 = 73856093u, PRIME2 = 19349663u, PRIME3 = 83492791u;

   // Copy the bit patterns out with memcpy; casting the pointers,
   // as in (unsigned int*)&x, would violate strict aliasing.
   unsigned int px, py, pz;
   std::memcpy(&px, &x, sizeof px);
   std::memcpy(&py, &y, sizeof py);
   std::memcpy(&pz, &z, sizeof pz);

   return px * PRIME1 + py * PRIME2 + pz * PRIME3;
}

Note that this assumes sizeof(unsigned int) == sizeof(float). The sample only illustrates the main idea; you should tune it for your needs.
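To put this to work for the deduplication step, one option (a sketch assuming C++11; VertexHash, VertexEq and indexOf are illustrative names) is to plug the hash_point above into a std::unordered_map from vertex to index:

#include <cstddef>
#include <unordered_map>
#include <vector>

struct Vertex { float x, y, z; };

struct VertexHash
{
   std::size_t operator()(const Vertex& v) const { return hash_point(v.x, v.y, v.z); }
};

struct VertexEq
{
   bool operator()(const Vertex& a, const Vertex& b) const
   { return a.x == b.x && a.y == b.y && a.z == b.z; }
};

// Return the index of v in `unique`, inserting it on first sight; O(1) amortized.
int indexOf(const Vertex& v,
            std::unordered_map<Vertex, int, VertexHash, VertexEq>& seen,
            std::vector<Vertex>& unique)
{
   auto it = seen.find(v);
   if (it != seen.end()) return it->second;
   seen.emplace(v, (int)unique.size());
   unique.push_back(v);
   return (int)unique.size() - 1;
}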

Kirill V. Lyadvinsky
When x, y and z are not integers, multiplying them by primes is not going to help much.
Tarydon
Correct. You should multiply the bit patterns of the floats, not the floats themselves.
Kirill V. Lyadvinsky
Beware that some values that compare equal have more than one bit-pattern representation: 0 and -0 are equal, yet have different bit patterns, so they will hash to different values. This may or may not be a problem for your application.
Michael Anderson
+2  A: 

Dump all the vertices into an array, then do unique(sort(array)). This should be O(k n log(n)), where k is the average number of triangles that share a vertex, usually k < 7.

The only caveat I can think of is that your unique function should be able to take a pointer to a comparison function, since you probably want to consider two vertices equal if

distance(vertex1, vertex2) < threshold

but that seems to be OK.
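In C++ this is a few lines with std::sort and std::unique. A minimal sketch using exact component equality (Vertex and the helper names are illustrative; whether a distance-threshold predicate can safely replace vertexEqual is debated in the comments below):

#include <algorithm>
#include <vector>

struct Vertex { float x, y, z; };

// Lexicographic order, so identical vertices end up adjacent after sorting.
bool vertexLess(const Vertex& a, const Vertex& b)
{
   if (a.x != b.x) return a.x < b.x;
   if (a.y != b.y) return a.y < b.y;
   return a.z < b.z;
}

bool vertexEqual(const Vertex& a, const Vertex& b)
{
   return a.x == b.x && a.y == b.y && a.z == b.z;
}

std::vector<Vertex> uniqueVertices(std::vector<Vertex> verts)
{
   std::sort(verts.begin(), verts.end(), vertexLess);
   verts.erase(std::unique(verts.begin(), verts.end(), vertexEqual), verts.end());
   return verts;
}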

AVB
This approach works. I implemented it using an STL list. +1
sum1stolemyname
And it is _much_ more efficient that way. (35 seconds with my first approach, 1.5 seconds with the refined approach)
sum1stolemyname
There's a nasty bug in this if you want to catch points within a threshold distance of each other: there's no guarantee that points close together sort near to each other. In fact I think you can prove that for any sort function there are points arbitrarily close together that sort to distant locations in the list... Not good.
Michael Anderson
@Michael: go on and prove it, then
AVB
If you just want a counterexample for a naive sort: consider this array of 2D points, already in the order a conventional sort would return: { [1,1], [1,10], [1.0001,1] }. [1.0001,1] is closer to [1,1] than [1,10] is, but because of the separation in the list, unique will miss it. (If you really want a proof of the general case for an arbitrary sort function, I can probably construct one, but it will be complicated, and you will probably need a degree in math to understand it.)
Michael Anderson