views:

430

answers:

7

When comparing two variants of pointers (classic/shared_pt) I was surprised by a significant increase of the running speed of the program. For testing 2D Delaunay incremental Insertion algorithm has been used.

Compiler settings VS 2010 (release) /O2 /MD /GL, W7 Prof, CPU 3.GHZ DualCore

Results:

shared_ptr (C++ 0x00):

N[points] t[sec]
100 000 6
200 000 11
300 000 16
900 000 36

Pointers:

N[points] t[sec]
100 000 0,5
200 000 1
300 000 2
900 000 4

Running time of the shared_ptr versions is approx. 10 times longer... Is this caused by the compiler settings or C++ 0x00 shared_ptr implementation is so slow?

VS2010 Profiler: For raw pointers approx 60% of the time is spent by heuristic searching of the triangle containing inserted point (it is OK, it is a well-known fact). But for the shared_ptr version approx 58% of the time is spent using shared_ptr.reset() and only 10% is used for heuristic searching.

Testing code with raw pointers:

void DT2D::DT ( Node2DList *nl, HalfEdgesList *half_edges_dt, bool print )
{
    // Create 2D Delaunay triangulation using incremental insertion method
    unsigned int nodes_count_before = nl->size();

    // Remove duplicit points
    nl->removeDuplicitPoints();

    // Get nodes count after deletion of duplicated points
    unsigned int nodes_count_after = nl->size();

    //Print info
    std::cout << "> Starting DT, please wait... ";
    std::cout << nodes_count_after << " points, " << ( nodes_count_before - nodes_count_after ) << " removed.";

    // Are in triangulation more than three points
    try
    {
            //There are at least 3 points
            if ( nodes_count_after > 2 )
            {
                    // Create simplex triangle
                    createSimplexTriangle ( nl, half_edges_dt );

                    // Increment nodes count
                    nodes_count_after += 3;

                    // Starting half edge using for searching
                    HalfEdge *e_heuristic = ( *half_edges_dt ) [0];

                    // Insert all points into triangulation using incremental method
                    for ( unsigned int i = 3; i < nodes_count_after; i++ )  // Jump over simplex
                    {
                            DTInsertPoint ( ( *nl ) [i], &e_heuristic, half_edges_dt );
                    }

                    //Corect boundary triangles (swap edges in triangles adjacent to simplex triangles).
                    //They are legal due to DT, but not creating the convex hull )
                    correctBoundaryTriangles ( nl, half_edges_dt );

                    // Remove triangles having simplex points
                    removeSimplexTriangles ( nl, half_edges_dt );
            }

            //Print results
            std::cout << " Completed." << std::endl;
    }

Insert point procedure:

void DT2D::DTInsertPoint ( Point2D *p, HalfEdge **e1, HalfEdgesList *half_edges_dt )
{
    // One step of the Delaunay triangulation, incremental insertion by de Berg (2001)
    short   status = -1;

    //Pointers
    HalfEdge *e31 = NULL;
    HalfEdge *e21 = NULL;
    HalfEdge *e12 = NULL;
    HalfEdge *e32 = NULL;
    HalfEdge *e23 = NULL;
    HalfEdge *e13 = NULL;
    HalfEdge *e53 = NULL;
    HalfEdge *e44 = NULL;
    HalfEdge *e63 = NULL;

    try
    {
            // Test, if point lies inside triangle
            *e1 = LawsonOrientedWalk::findTriangleWalk ( p, &status, *e1, 0 );

            if ( e1 != NULL )
            {
                    // Edges inside triangle lies the point
                    HalfEdge *e2 = ( *e1 )->getNextEdge();
                    HalfEdge *e3 = e2->getNextEdge();

                    // Point lies inside the triangle
                    if ( status == 1 )
                    {
                            // Create first new triangle T1, twin edges set after creation
                            e31 = new HalfEdge ( p, *e1, NULL );
                            e21 = new HalfEdge ( e2->getPoint(), e31, NULL );
                            ( *e1 )->setNextEdge ( e21 );

                            // Create second new triangle T2, twin edges set after creation
                            e12 = new HalfEdge ( p, e2, NULL );
                            e32 = new HalfEdge ( e3->getPoint(), e12, NULL );
                            e2->setNextEdge ( e32 );

                            // Create third new triangle T3, twin edges set after creation
                            e23 = new HalfEdge ( p, e3, NULL );
                            e13 = new HalfEdge ( ( *e1 )->getPoint(), e23, NULL );
                            e3->setNextEdge ( e13 );

                            // Set twin edges in T1, T2, T3
                            e12->setTwinEdge ( e21 );
                            e21->setTwinEdge ( e12 );
                            e13->setTwinEdge ( e31 );
                            e31->setTwinEdge ( e13 );
                            e23->setTwinEdge ( e32 );
                            e32->setTwinEdge ( e23 );

                            // Add new edges into list
                            half_edges_dt->push_back ( e21 );
                            half_edges_dt->push_back ( e12 );
                            half_edges_dt->push_back ( e31 );
                            half_edges_dt->push_back ( e13 );
                            half_edges_dt->push_back ( e32 );
                            half_edges_dt->push_back ( e23 );

                            // Legalize triangle T1
                            if ( ( *e1 )->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, *e1 );
                            }

                            // Legalize triangle T2
                            if ( e2->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e2 );
                            }

                            // Legalize triangle T3
                            if ( e3->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e3 );
                            }
                    }

                    // Point lies on the edge of the triangle
                    else if ( status == 2 )
                    {
                            // Find adjacent triangle
                            HalfEdge *e4 = ( *e1 )->getTwinEdge();
                            HalfEdge *e5 = e4->getNextEdge();
                            HalfEdge *e6 = e5->getNextEdge();

                            // Create first new triangle T1, twin edges set after creation
                            e21 = new HalfEdge ( p, e3, NULL );
                            ( *e1 )->setNextEdge ( e21 );

                            // Create second new triangle T2, OK
                            e12 = new HalfEdge ( p, e2, e4 );
                            e32 = new HalfEdge ( e3->getPoint(), e12, e21 );
                            e2->setNextEdge ( e32 );

                            // Create third new triangle T3, twin edges set after creation
                            e53 = new HalfEdge ( p, e6, NULL );
                            e4->setNextEdge ( e53 );

                            // Create fourth new triangle T4, OK
                            e44 = new HalfEdge ( p, e5, *e1 );
                            e63 = new HalfEdge ( e6->getPoint(), e44, e53 );
                            e5->setNextEdge ( e63 );

                            // Set twin edges in T1, T3
                            e21->setTwinEdge ( e32 );
                            ( *e1 )->setTwinEdge ( e44 );
                            e53->setTwinEdge ( e63 );
                            e4->setTwinEdge ( e12 );

                            // Add new edges into list
                            half_edges_dt->push_back ( e21 );
                            half_edges_dt->push_back ( e12 );
                            half_edges_dt->push_back ( e32 );
                            half_edges_dt->push_back ( e53 );
                            half_edges_dt->push_back ( e63 );
                            half_edges_dt->push_back ( e44 );

                            // Legalize triangle T1
                            if ( e3->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e3 );
                            }

                            // Legalize triangle T4
                            if ( e5->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e5 );
                            }

                            // Legalize triangle T3
                            if ( e6->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e6 );
                            }

                            // Legalize triangle T2
                            if ( e2->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e2 );
                            }
                    }
            }
    }
    //Throw exception
    catch ( std::bad_alloc &e )
    {
            //Free memory
            if ( e31 != NULL ) delete e31;
            if ( e21 != NULL ) delete e21;
            if ( e12 != NULL ) delete e12;
            if ( e32 != NULL ) delete e32;
            if ( e23 != NULL ) delete e23;
            if ( e13 != NULL ) delete e13;
            if ( e53 != NULL ) delete e53;
            if ( e44 != NULL ) delete e44;
            if ( e63 != NULL ) delete e63;

            //Throw exception
            throw ErrorBadAlloc ( "EErrorBadAlloc: ", "Delaunay triangulation: Can not create new triangles for inserted point p." );
    }

    //Throw exception
    catch ( ErrorMathZeroDevision &e )
    {
            //Free memory
            if ( e31 != NULL ) delete e31;
            if ( e21 != NULL ) delete e21;
            if ( e12 != NULL ) delete e12;
            if ( e32 != NULL ) delete e32;
            if ( e23 != NULL ) delete e23;
            if ( e13 != NULL ) delete e13;
            if ( e53 != NULL ) delete e53;
            if ( e44 != NULL ) delete e44;
            if ( e63 != NULL ) delete e63;

            //Throw exception
            throw ErrorBadAlloc ( "EErrorMathZeroDevision: ", "Delaunay triangulation: Can not create new triangles for inserted point p." );
    }
}

Testing code with shared_ptr: Code wa rewriten without any optimalization...

void DT2D::DTInsertPoint ( std::shared_ptr <Point2D> p, std::shared_ptr <HalfEdge> *e1, HalfEdgesList * half_edges_dt )
{
    // One step of the Delaunay triangulation, incremental insertion by de Berg (2001)
    short   status = -1;

    //Pointers
    std::shared_ptr <HalfEdge> e31;
    std::shared_ptr <HalfEdge> e21;
    std::shared_ptr <HalfEdge> e12;
    std::shared_ptr <HalfEdge> e32;
    std::shared_ptr <HalfEdge> e23;
    std::shared_ptr <HalfEdge> e13;
    std::shared_ptr <HalfEdge> e53;
    std::shared_ptr <HalfEdge> e44;
    std::shared_ptr <HalfEdge> e63;

    try
    {
            // Test, if point lies inside triangle
            *e1 = LawsonOrientedWalk::findTriangleWalk ( p, &status, *e1, 0 );

            if ( e1 != NULL )
            {
                    // Edges inside triangle lies the point
                    std::shared_ptr <HalfEdge> e2((*e1 )->getNextEdge());
                    std::shared_ptr <HalfEdge> e3(e2->getNextEdge());

                    // Point lies inside the triangle
                    if ( status == 1 )
                    {
                            // Create first new triangle T1, twin edges set after creation
            e31.reset( new HalfEdge ( p, *e1, NULL ));
                            e21.reset( new HalfEdge ( e2->getPoint(), e31, NULL ));
                            ( *e1 )->setNextEdge ( e21 );

                            // Create second new triangle T2, twin edges set after creation
                            e12.reset( new HalfEdge ( p, e2, NULL ));
                            e32.reset( new HalfEdge ( e3->getPoint(), e12, NULL ));
                            e2->setNextEdge ( e32 );

                            // Create third new triangle T3, twin edges set after creation
                            e23.reset( new HalfEdge ( p, e3, NULL ));
                            e13.reset( new HalfEdge ( ( *e1 )->getPoint(), e23, NULL ));
                            e3->setNextEdge ( e13 );

                            // Set twin edges in T1, T2, T3
                            e12->setTwinEdge ( e21 );
                            e21->setTwinEdge ( e12 );
                            e13->setTwinEdge ( e31 );
                            e31->setTwinEdge ( e13 );
                            e23->setTwinEdge ( e32 );
                            e32->setTwinEdge ( e23 );

                            // Add new edges into list
                            half_edges_dt->push_back ( e21 );
                            half_edges_dt->push_back ( e12 );
                            half_edges_dt->push_back ( e31 );
                            half_edges_dt->push_back ( e13 );
                            half_edges_dt->push_back ( e32 );
                            half_edges_dt->push_back ( e23 );

                            // Legalize triangle T1
                            if ( ( *e1 )->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, *e1 );
                            }

                            // Legalize triangle T2
                            if ( e2->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e2 );
                            }

                            // Legalize triangle T3
                            if ( e3->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e3 );
                            }
                    }

                    // Point lies on the edge of the triangle
                    else if ( status == 2 )
                    {
                            // Find adjacent triangle
                            std::shared_ptr <HalfEdge> e4 = ( *e1 )->getTwinEdge();
                            std::shared_ptr <HalfEdge> e5 = e4->getNextEdge();
                            std::shared_ptr <HalfEdge> e6 = e5->getNextEdge();

                            // Create first new triangle T1, twin edges set after creation
                            e21.reset(new HalfEdge ( p, e3, NULL ));
                            ( *e1 )->setNextEdge ( e21 );

                            // Create second new triangle T2, OK
                            e12.reset(new HalfEdge ( p, e2, e4 ));
                            e32.reset(new HalfEdge ( e3->getPoint(), e12, e21 ));
                            e2->setNextEdge ( e32 );

                            // Create third new triangle T3, twin edges set after creation
                            e53.reset(new HalfEdge ( p, e6, NULL ));
                            e4->setNextEdge ( e53 );

                            // Create fourth new triangle T4, OK
                            e44.reset(new HalfEdge ( p, e5, *e1 ));
                            e63.reset(new HalfEdge ( e6->getPoint(), e44, e53 ));
                            e5->setNextEdge ( e63 );

                            // Set twin edges in T1, T3
                            e21->setTwinEdge ( e32 );
                            ( *e1 )->setTwinEdge ( e44 );
                            e53->setTwinEdge ( e63 );
                            e4->setTwinEdge ( e12 );

                            // Add new edges into list
                            half_edges_dt->push_back ( e21 );
                            half_edges_dt->push_back ( e12 );
                            half_edges_dt->push_back ( e32 );
                            half_edges_dt->push_back ( e53 );
                            half_edges_dt->push_back ( e63 );
                            half_edges_dt->push_back ( e44 );

                            // Legalize triangle T1
                            if ( e3->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e3 );
                            }

                            // Legalize triangle T4
                            if ( e5->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e5 );
                            }

                            // Legalize triangle T3
                            if ( e6->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e6 );
                            }

                            // Legalize triangle T2
                            if ( e2->getTwinEdge() != NULL )
                            {
                                    legalizeTriangle ( p, e2 );
                            }
                    }
            }
    }
    //Throw exception
    catch ( std::bad_alloc &e )
    {
    /*
            //Free memory
            if ( e31 != NULL ) delete e31;
            if ( e21 != NULL ) delete e21;
            if ( e12 != NULL ) delete e12;
            if ( e32 != NULL ) delete e32;
            if ( e23 != NULL ) delete e23;
            if ( e13 != NULL ) delete e13;
            if ( e53 != NULL ) delete e53;
            if ( e44 != NULL ) delete e44;
            if ( e63 != NULL ) delete e63;
    */
            //Throw exception
            throw ErrorBadAlloc ( "EErrorBadAlloc: ", "Delaunay triangulation: Can not create new triangles for inserted point p." );
    }

    //Throw exception
    catch ( ErrorMathZeroDevision &e )
    {
    /*
            //Free memory
            if ( e31 != NULL ) delete e31;
            if ( e21 != NULL ) delete e21;
            if ( e12 != NULL ) delete e12;
            if ( e32 != NULL ) delete e32;
            if ( e23 != NULL ) delete e23;
            if ( e13 != NULL ) delete e13;
            if ( e53 != NULL ) delete e53;
            if ( e44 != NULL ) delete e44;
            if ( e63 != NULL ) delete e63;
    */
            //Throw exception
            throw ErrorBadAlloc ( "EErrorMathZeroDevision: ", "Delaunay triangulation: Can not create new triangles for inserted point p." );
    }
}

Thanks for your help...

Edit
I replaced direct passing of all objects with alias passing &. Copy constructors are used less frequent then before.

Updated tables for shared_ptr

shared_ptr (C++ 0x00) old:

N[points] t[sec]
100 000 6
200 000 11
300 000 16
900 000 36

shared_ptr (C++ 0x00) new version:

N[points] t[sec]
100 000 2
200 000 5
300 000 9
900 000 24

There is a considerable improvement, but shared_pointer version is 4 times slower than raw pointer one... I am afraid, that running speed of the program can not be significantly increased...

+5  A: 

Try it in "release" mode and see if you get closer benchmarks. Debug mode tends to turn on lots of assertions in the STL which slow lots of things down.

Ken Simon
Furthermore, Debug version of the STL in MSVC has a global lock that prevents from getting the benefits of using multiple cores. See http://stackoverflow.com/questions/2832023/c-map-performance-linux-30-sec-vs-windows-30-mins/2832511#2832511
Didier Trosset
It was realease mode, /MDd represents debug mode...
Ian
/MD implies this is already release mode. /MDd would be Debug. I agree that profiling in Debug would be a waste of time but that appears not to be the case here.http://msdn.microsoft.com/en-us/library/2kzt1wy3(VS.80).aspx
Steve Townsend
Debug version of MSVC also has a horrible thing called `_HAS_ITERATOR_DEBUGGING` -- that slows down things *a lot*.
dirkgently
Ah, I totally forgot to look at your cflags. I'd go with what others said: make sure your pointers are actually getting deleted in your raw pointer code, or else you're comparing apples to oranges and basically saying "my memory-leaking code is faster than my non-leaking code."
Ken Simon
@dirkgently - iterator debugging has helped me catch a lot of bugs. I like it. You can always turn it off if the performance is too bad!
AshleysBrain
@Ken Simon: raw pointers were deleted and checked, there is no memory leak...
Ian
A: 

It's impossible to answer this without more data. Have you profiled the code to accurately identify the source of the slowdown in the shared_ptr version? Using the container will certainly add overhead but I'd be surprised if it makes it 10x slower.

VSTS has nice perf tools that will attribute the CPU usage exactly if you can run this for 30 secs or so. If you don't have access to the VS Performance Tools or other profiling toolset, then run the shared_ptr code in the debugger and break into it 10 or 15 times to get a brute force sample of where it's spending all its time. This is surprisingly and counter-intuitively effective, I have found.

[EDIT] Do not pass your shared_ptr by value in that variant of the code - use ref to const. If this function is called a lot this will have measurable -ve impact.

Steve Townsend
VS2010 Profiler: For raw pointers the most % of time is spend by Heuristic founding of the triangle. But for the shared_ptr version 58% of the time is spent using shared_ptr.reset()
Ian
How easy or otherwise is it for you to retry this analysis with boost::shared_ptr?
Steve Townsend
Ian
Great - now, where is shared_ptr.reset() getting called from? if that is (was?) 58% of your time, then reducing the number of calls should be a big win. Do you have updated analysis for tuning? v hard to help without full source code though.
Steve Townsend
+6  A: 

By default, if you create your shared pointers the naïve way (i.e. shared_ptr<type> p( new type )) you incur two memory allocations, one for the actual object and an extra allocation for the reference count. You can avoid the extra allocation by making use of the make_shared template that will perform a single instantiation for both the object and the reference count and then in-place construct the object.

The rest of the extra costs are quite small compared with doubling the calls to malloc, like incrementing and decrementing the count (both atomic operations) and testing for deletion. If you can provide some code in how you are using the pointers/shared pointers you might get a better insight as to what is actually going on in the code.

David Rodríguez - dribeas
+9  A: 

shared_ptr are the most complicated type of pointer ever:

  • Ref counting takes time
  • Multiple allocation (there are 3 parts: the object, the counter, the deleter)
  • A number of virtual methods (in the counter and the deleter) for type erasure
  • Works among multiple threads (thus synchronization)

There are 2 ways to make them faster:

  • use make_shared to allocate them, because (unfortunately) the normal constructor allocates two different blocks: one for the object and one for the counter and deleter.
  • don't copy them if you don't need to: methods should accept shared_ptr<T> const&

But there are also many ways NOT to use them.

Looking at your code it looks like your doing a LOT of memory allocation, and I can't help but wonder if you couldn't find a better strategy. I must admit I didn't got the full figure, so I may be heading straight into a wall but...

Usually code is much simpler if you have an owner for each of the objects. Therefore, shared_ptr should be a last resort measure, employed when you can't get a single owner.

Anyway, we're comparing apples and oranges here, the original code is buggy. You take care of deleting the memory (good) but you forgot that these objects were also referenced from other points in the program e1->setNextEdge(e21) which now holds pointers to destructed objects (in a free'd memory zone). Therefore I guess that in case of exception you just wipe out the entire list ? (Or somehow bet on undefined behavior to play nice)

So it's hard to judge on performances since the former doesn't recover well from exceptions while the latter does.

Finally: Have you thought about using intrusive_ptr ? It could give you some boost (hehe) if you don't synchronize them (single thread) and you would avoid a lot of stuff performed by the shared_ptr as well as gain on locality of reference.

Matthieu M.
+1  A: 

I always recommend using std::shared_ptr<> instead of relying on manual memory life-time management. However, automatic lifetime management costs something but usually not significant.

In your case you noticed shared_ptr<> is significant and as some said you should make sure that you don't unnecessarily copies a shared pointer as that force an addref/release.

But there's another question in the background: Do you really need to rely on new/delete in the first place? new/delete uses malloc/free which are not tuned for allocations of small objects.

A library that helped me alot before is boost::object_pool.

At some stage I wanted to create graphs very fast. Nodes and edges are naturally dynamically allocated and I get two costs from doing that.

  1. malloc/free
  2. Memory lifetime management

boost:object_pool helps reduce both these costs at the costs of not being as general as malloc/free.

So as an example let's say we have a simple node like this:

   struct node
   {
      node * left;
      node * right;
   };

So instead of allocation node with new I use boost::object_pool. But boost::object_pool also tracks all instance allocated with it so at the end of my calculation I destroyed object_pool and didn't need to track each node thus simplifying my code and improving the performance.

I did some performance testing (I wrote my own pool class just for fun but bool::object_pool should give the same performance or better).

10,000,000 nodes created and destroyed

  1. Plain new/delete: 2.5secs
  2. shared_ptr: 5secs
  3. boost::object_pool: 0.15secs

So if boost::object_pool works for you it might help reduce the memory allocation overhead significantly.

FuleSnabel
like that! I have used my own pools of fixed-block sized objects, which was super-fast, but this looks good.
gbjbaanb
+1  A: 

shared_ptr are noticeably slower than raw pointers. That's why they should only be used if you actually need shared ownership semantics.

Otherwise, there are several other smart pointer types available. scoped_ptr and auto_ptr (C++03) or unique_ptr (C++0x) both have their uses. And often, the best solution is not to use a pointer of any kind, and just write your own RAII class instead.

A shared_ptr has to increment/decrement/read the reference counter, and depending on the implementation and how it is instantiated, the ref counter may be allocated separately, causing potential cache misses. And it has to access the ref counter atomically, which adds additional overhead.

jalf
A: 

I am sending some comments using the answer form:

@jalf: I do not find scoped_ptr and auto_ptr as the best solution, there are problems with transferring of the ownership. Writing my own RAII class instead, it is a heavyweight solution with significant changes of the source code...

@FuleSnabel: Your solution is inspiring, I have never haeard about object_pool. I will try to test it... Unfortunately it is not a part of C++ 0x00 standard...

There is another testing code:

#include <vector>
#include <memory>

class Point2D
{
    protected:
            double x;
            double y;

public:
    Point2D(double xx, double yy) : x(xx), y(yy) {}
};

Points are stored in vector:

typedef std::vector <std::shared_ptr <Point2D> > TNodes2DListS;

class Node2DListS
{
    protected:
            TNodes2DListS nodes;

    public:
            Node2DListS() {}

    public:
            inline void push_back ( std::shared_ptr <Point2D> p ) {nodes.push_back ( p     );}
};

typedef std::vector <Point2D * > TNodes2DList;

//List of 2D nodes
class Node2DList
{
    protected:
            TNodes2DList nodes;

    public:
    Node2DList() {}

    ~Node2DList ()
    {
        for (unsigned int i = 0; i < nodes.size(); i++)
        {
            delete nodes[i];
        }

        nodes.clear();
    }

    public:
            //Overloaded member functions
            void clear();
            inline void push_back ( Point2D *p) {nodes.push_back ( p );}

};



int _tmain(int argc, _TCHAR* argv[])
{
const unsigned int POINTS = 1000000;
Node2DList nl;
Node2DListS nlS;

//Time: 5 sec
for (int i = 0; i < POINTS; i++)
{
    Point2D *p = new Point2D (i, i);
    nl.push_back(p);
}
return 0;

//Time 18 s 
for (int i = 0; i < POINTS; i++)
{
    std::shared_ptr <Point2D> p = std::make_shared <Point2D> (i,i);
    nlS.push_back(p);
}

//Time 19 s 
for (int i = 0; i < POINTS; i++)
{
    std::shared_ptr <Point2D> p (new Point2D(i,i));
    nlS.push_back(p);
}

return 0;
}

And results for shared_ptrs are still bad:

Raw pointers: 5s shared_ptr: make shared: 18s shared_ptr: new: 19s

I prefer speed to safety and I will probably use only raw pointers, because the program is significantly slowdown...

Comparision with Java: 6s: Java is about 3 times faster !

public class Point2D {
     protected double x, y;

 public Point2D(double xx, double yy)
     {
         this.x = xx;
         this.y = yy;
     }
}

Java implementation using ArrayList:

public class Points2DList {

protected ArrayList<Point2D> nodes;

public Points2DList() {
    nodes = new ArrayList<Point2D>();
}

public void add(Point2D p) {
    nodes.add(p);
}
}

public class Main {

  public static void main(String[] args)
  {
    int POINTS = 1000000;
    Points2DList pl = new Points2DList();

    for (int i = 0; i < POINTS; i++)
    {
      Point2D p = new Point2D (i, i);
      pl.add(p);
    }
   }
}
Ian
jalf
And to answer your comments, you can't have everything. Either write in a language which implements shared ownership efficiently (such as Java) *or* design your code so that shared ownership is unnecessary *or* live with the performance hit of shared ownership. Reading your code, I don't see why you have to work with dynamically allocated pointers *at all*. Why can't your half-edges be allocated on the stack, and copied into the half-edge list?
jalf
Calling "new" on everything and using pointers everywhere is a common mistake among Java programmers trying to use C++. Overuse of `shared_ptr` points in the same direction. It looks like you're trying to write Java code in C++, and that's a bad idea. If you're going to write Java code, do it in Java, where it'll at least run efficiently. C++ is a different language, and it just won't perform well if you don't embrace it and write C++ code instead of Java dressed up to compile as C++
jalf