views: 1149
answers: 15

I need to write a program (a university project) that approximately solves an NP-hard problem; it is a variation of the linear ordering problem. In general, I will have very large inputs (graphs) and will try to find the best solution (based on a function that 'rates' each solution).

Will there be a difference if I write this in C-style code (one main and free functions) or build a Solver class, create an instance, and invoke a 'run' method from main (similar to Java)?

Also, there will be a lot of floating-point math going on in each iteration.

Thanks!

+2  A: 

Not inherently, no.

Andrew Medico
@Andrew: We're trying to discourage the use of tags like 'offtopic', 'not-programming-related', and 'belongs-on-superuser'. They don't really help us categorize information. When editing tags please **do not** replace existing tags, since you're then removing useful information and replacing it with noise. Thank you.
Bill the Lizard
+51  A: 

No.

The biggest performance gains/flaws will come from the algorithm you implement and from how much unneeded work you perform. Unneeded work could be anything from recalculating a previous value that could have been cached, to using many malloc/free calls instead of a memory pool, to passing large immutable data by value instead of by reference.
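
A minimal sketch of those last two points (the `Graph` type, the `rate` function, and the rolling hash are invented stand-ins for illustration, not anything from the question):

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical types for the sketch: a dense weight matrix as the graph.
using Graph = std::vector<std::vector<double>>;

// Rate an ordering; here, just sum the weights along consecutive pairs.
double rate(const Graph& g, const std::vector<int>& ordering) {
    double score = 0.0;
    for (std::size_t i = 1; i < ordering.size(); ++i)
        score += g[ordering[i - 1]][ordering[i]];
    return score;
}

// Large immutable inputs go by const reference (no copy), and previously
// rated orderings are looked up instead of recomputed.
double rate_cached(const Graph& g, const std::vector<int>& ordering,
                   std::unordered_map<std::size_t, double>& cache) {
    std::size_t key = 0;
    for (int v : ordering)                     // cheap rolling hash of the ordering
        key = key * 1000003u + static_cast<std::size_t>(v);
    auto it = cache.find(key);
    if (it != cache.end()) return it->second;  // cache hit: skip the recomputation
    double score = rate(g, ordering);
    cache.emplace(key, score);
    return score;
}
```

(The rolling hash can collide, so a real cache would key on the ordering itself; the point is only that a lookup is far cheaper than re-rating.)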

nos
If you look at the Language Shootout (http://shootout.alioth.debian.org/), you'll see very little difference between C and C++.
mrjbq7
+1: The OP describes solving an NP-hard problem, so the real bottleneck is computational complexity, not language speed. Even for small problem sets, an O(log n) algorithm in Python is going to beat an O(n^2) algorithm in C every time. The computational complexity is going to dwarf the tiny constant factor of performance gained by using language X instead of language Y.
Juliet
@Juliet: Actually, no. For small problem sets, the O(n^2) algorithm may be faster than an O(log n) or even an O(1) algorithm. For instance, with associations of 10 or 15 elements, doing a linear O(n) search on a linked list is faster than using a hash table, which is O(1) (and if your problem consists of doing lots and lots of operations on small associations, linked lists may be much faster than hash tables). Big-O complexity only describes performance characteristics on *large* data sets.
Brian Campbell
+1  A: 

Since both are compiled, and compilers are now very good at handling C++, I think the only problem would come from how well optimized your code is. I think it is easier to write slower code in C++, but that depends on which style your model fits best.

When it comes down to it, I doubt there will be any real difference, assuming both versions are well written, the libraries you use are well written, and you measure on the same computer.

James Black
It's easier to write slower code in C++! If you are using GCC, then both languages are compiled into the same intermediate language. It is this language that is then optimized and converted into machine language. That's like saying it's easier to write green code in C++.
Martin York
+1  A: 

As long as you don't use any virtual functions etc., you won't notice any considerable performance differences. Early C++ was compiled to C, so as long as you know the points where this creates considerable overhead (such as virtual functions), you can account for the differences.

In addition, I want to note that using C++ gives you a lot to gain if you use the STL and Boost libraries. The STL in particular provides very efficient, proven implementations of the most important data structures and algorithms, so you can save a lot of development time.

Effectively, it also depends on the compiler you will be using and how it optimizes the code.
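
For a taste of that, a minimal sketch (the sizes and values are arbitrary, invented for illustration):

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

int main() {
    // A random starting solution for a linear-ordering search, in three lines:
    std::vector<int> ordering(1000);
    std::iota(ordering.begin(), ordering.end(), 0);       // 0, 1, ..., 999
    std::shuffle(ordering.begin(), ordering.end(), std::mt19937(42));

    // Proven building blocks instead of hand-rolled ones:
    std::vector<double> scores(4);
    scores[0] = 3.5; scores[1] = 1.2; scores[2] = 9.9; scores[3] = 4.4;
    std::sort(scores.begin(), scores.end());              // introsort, O(n log n)
    double best = scores.back();                          // highest-rated candidate
    (void)best;
}
```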

Johannes Rudolph
I'd even go so far as to argue there won't be any *considerable* performance differences *even* if he used virtual functions. :)
Isak Savo
I find Johannes' opinion of virtual functions dated. In 1989, virtual functions were slightly slower than direct calls, enough that you would notice in a tight loop. But in 2009 that is not the case.
jmucchiello
Disagree with the virtual functions comment. If you use them in C++, you would have to re-invent them in C. Do you think your reinvented version in C would be quicker than the compiler-generated version in C++?
Martin York
Both languages are still compiled to the same intermediate language in most compilers.
Martin York
@Martin York: well, going from a procedural language to an object-oriented one might change the way we want to implement some things, enabling us to write simpler but slower code? It's a tradeoff. Note that I am not so much concerned about virtual functions as about everything else that creates additional runtime code (constructors/destructors, dynamic casting, etc.).
Johannes Rudolph
+18  A: 

The biggest roadblock to optimal code is no longer the language (for properly compiled languages), but rather the programmer.

Milan Ramaiya
:\ If the programmer chooses to learn inferior languages?
Matt Joiner
+5  A: 

Rule of thumb: do not optimize until you know what to optimize. So start with C++ and get a working prototype. Then profile it and rewrite the bottlenecks in assembly. But as others noted, the chosen algorithm will have a much greater impact than the language.

BostonLogan
+1  A: 

Function call vs. member function call overhead is unlikely to be the limiting factor compared to file input and the algorithm itself. C++ iostreams are not necessarily high speed. C has 'restrict' if you're really optimizing; in C++ it's easier to inline function calls. Overall, C++ offers more options for organizing your code clearly, but if it's not a big program, or if you're going to write it in a similar manner whether it's C or C++, then the portability of C libraries becomes more important.
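
A sketch of both points; note that `restrict` is standard C99, while the `__restrict` spelling below is a common but non-standard extension in C++ compilers:

```cpp
#include <cstddef>

// __restrict promises the compiler the two arrays never alias, which
// enables more aggressive vectorization of the loop (a hypothetical
// axpy kernel, for illustration).
void axpy(double* __restrict y, const double* __restrict x, double a,
          std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

// The C++ side of the point: a small function visible at the call site
// is trivially inlined, unlike a call through a C function pointer.
inline double clamp01(double v) {
    return v < 0.0 ? 0.0 : (v > 1.0 ? 1.0 : v);
}
```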

paul
+8  A: 
int3
I'd bet that if you compare wise use of virtual functions with alternatives that can be done in C (function pointers, switch statements, etc.), you will find that virtual functions perform as well or better than those alternatives.
Fred Larson
Hey, that's a really nice post. You said something that actually adds to the question itself!
Xolve
That's a silly comment about virtual functions. If you need them, you need them. Implementing the equivalent of virtual functions in C will be just as expensive as the compiler-generated ones in C++. The difference is that the compiler-generated ones in C++ are more likely to be correct and have 10 years of compiler optimization tricks behind them.
Martin York
Using virtual functions does not necessarily mean worse performance. One has to take into account the time spent in the body of the virtual function. For example, the indirection incurred by performing the vtable lookup may be dominated by what is actually happening in the body of the virtual function.
Void
+1 Martin. Well said.
Void
well, what I meant was that they should not be used *carelessly* -- e.g. you shouldn't drop the keyword `virtual` into your class definition 'just in case' it comes in useful in the future, without thinking about its implications. I am aware that simulating it yourself is generally worse, though. Will edit answer.
int3
you shouldn't make such a bold statement without ALSO stating the amount of overhead - say, compared to a normal function call. A rule of thumb is 10% on average, 50% at most - but the potentially biggest factor is losing an inlining opportunity. So in most cases **virtual doesn't matter**.
peterchen
**Slowdown on virtual function calls is a lie**. Calling a virtual function is slower than calling a normal function only by 1-3 pointer additions. For it to matter, your functions would have to be simpler than adding 10 integers. Are they???
Pavel Shved
To be fair, you should add in the memory access time for the call, as well as the increased size of the object due to the vptr. If you are copying around a million of those objects then it could matter. Anyway, I never said that it was a _major_ slowdown. But I guess the SO anti-micro-optimization corps couldn't let it pass..
int3
Also, just declaring a method as virtual doesn't mean it's called using a vtable. If the compiler can determine the exact type of a variable because it's on the stack, it shouldn't need to use virtual dispatch.
rlbond
@rlbond: well, in theory... but I don't think compilers are very good at it yet. The last time I tried it (a virtual function call where the calling function 'knew' the derived type), gcc didn't eliminate the vtable call, even with -O3.
int3
If a class has virtual functions, its size is increased by a pointer (more than one only with multiple/virtual inheritance). How big are the objects you are calling virtual functions on?
Pavel Shved
@splicer, for an interprocedural call of a non-inline, non-static function the rule doesn't apply. Who knows what will eventually call the function?
Pavel Shved
I don't think virtual call overhead breaks the "zero-overhead" principle. The principle is "you don't pay for what you don't use", not "you don't pay for what you don't need, but are using anyway because you asked for it by accident" ;-) But if it does break the principle (i.e, if the fact that `virtual` is in the language slows down non-virtual calls), then I'd be interested to see an explanation of how.
Steve Jessop
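
To ground the numbers being argued over in this thread, a minimal sketch of both call forms (the `Scorer` hierarchy is hypothetical):

```cpp
// A hypothetical hierarchy, to make the dispatch cost concrete.
struct Scorer {
    virtual ~Scorer() {}
    virtual double score(double x) const { return x; }
};

struct SquareScorer : Scorer {
    double score(double x) const { return x * x; }
};

double run(const Scorer& s, double x) {
    return s.score(x);  // dynamic dispatch: one vtable indirection per call
}

int main() {
    SquareScorer sq;
    double a = run(sq, 3.0);   // virtual call through the base reference
    double b = sq.score(3.0);  // exact type known here: the compiler is
                               // allowed to devirtualize and inline this
    return (a == b) ? 0 : 1;
}
```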
+2  A: 

The Solver class will be constructed once, I take it, and the run method executed once... in that kind of environment, you won't see a difference. Instead, here are things to watch out for:

  • Memory management is hellishly expensive. If you need to do lots of little malloc()s, the operating system will eat your lunch. Make a determined effort to re-use whatever data structures you create if you know you'll be doing the same kind of thing again soon!

  • Instantiating classes generally means... allocating memory! Again, there's practically no cost for instantiating a handful of objects and re-using them. But beware of creating objects only to tear them down and rebuild them soon after!

  • Choose the right flavor of floating point for your architecture, insofar as the problem permits. It's possible that double will end up being faster than float, although it will need more memory. You should experiment to fine-tune this. Ideally, you'll use a #define or typedef to specify the type so you can change it easily in one place (see the sketch after this list).

  • Integer calculations are probably faster than floating point. Depending on the numeric range of your data, you may also consider doing it with integers treated as fixed-point decimals. If you need 3 decimal places, you could use ints and just consider them "milli-somethings". You'll have to remember to shift the decimal point after division and multiplication... but no big deal. If you use any math functions beyond basic arithmetic, of course, that would kill this possibility.
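
Here is the sketch referenced above, combining the typedef idea and the milli-unit idea (`real_t`, `milli_t`, and the helper functions are invented names for illustration):

```cpp
#include <cstdio>

typedef double real_t;      // flip to 'float' here and re-benchmark

// Fixed-point sketch: values with 3 decimal places stored as integer
// "milli-units". Multiplication doubles the scale, so divide it back out.
typedef long long milli_t;  // value * 1000
const milli_t SCALE = 1000;

milli_t milli_mul(milli_t a, milli_t b) { return a * b / SCALE; }
milli_t milli_div(milli_t a, milli_t b) { return a * SCALE / b; }

int main() {
    milli_t x = 1500;                        // represents 1.500
    milli_t y = 2250;                        // represents 2.250
    std::printf("%lld\n", milli_mul(x, y));  // prints 3375, i.e. 3.375

    real_t r = 0.5;    // floating-point flavor chosen in one place
    (void)r;
}
```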

Carl Smotricz
+1  A: 

First, writing in C++ doesn't imply using OOP; look at the STL algorithms. Second, C++ can be even slightly faster at runtime (compilation times can be terrible compared to C, but that's because modern C++ tends to rely heavily on abstractions that tax the compiler).

Edit: all right, see Bjarne Stroustrup's discussion of qsort and std::sort, and the article that the FAQ mentions ("Learning Standard C++ as a New Language"), where he shows that C++-style code can be not only shorter and more readable (because of higher abstractions), but also somewhat faster.

just somebody
While true, your answer needs to be backed up heavily by examples and explanations before having any real merit. For starters, I'd suggest explaining (and linking to) Stroustrup's comparison between C's `qsort` and C++'s `sort` routines. (But +1 for the good start)
Konrad Rudolph
+4  A: 

When speaking of performance, anything you can do in C can be done in C++. For example, virtual methods are known to be “slow”, but if it's really a problem, you can still resort to C idioms.

C++ also brings templates, which lead to better performance than using void* for generic programming.
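
A sketch of the template-vs-`void*` point; a generic swap is the classic illustration (the 64-byte buffer is an assumption made to keep the example short):

```cpp
#include <cstddef>
#include <cstring>

// C-style genericity: the compiler sees only bytes, so nothing can be
// inlined or type-checked. (Assumes objects no larger than 64 bytes.)
void swap_generic(void* a, void* b, std::size_t size) {
    unsigned char tmp[64];
    std::memcpy(tmp, a, size);
    std::memcpy(a, b, size);
    std::memcpy(b, tmp, size);
}

// The template version: one fully typed specialization per T, trivially
// inlined, no indirection.
template <typename T>
void swap_typed(T& a, T& b) {
    T tmp = a;
    a = b;
    b = tmp;
}
```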

Bastien Léonard
++ If virtual functions are something you actually need, then they work just as fast in either language. They're just a little prettier in C++. Regardless, the code has to be really, really tight before you would notice a difference between virtual and non-virtual functions.
Mike Dunlavey
+1 for template speed comment.
Marcus Lindblom
Slightly OT, but I saw a Scott Meyers talk a few years ago where he showed the use of templates to generate exactly the same code as hand-optimised C (this was in the context of embedded systems, but the same principle still holds).
the_mandrill
+1  A: 

Another aspect:

C++ templates can be an excellent tool to generate type-specific / optimized code variations.

For example, C qsort requires a function call to the comparator, whereas std::sort can inline the functor passed. This can make a significant difference when compare and swap themselves are cheap.

Note that you could generate "custom qsorts" optimized for various types with a barrage of #defines, a code generator, or by hand - so you could do these optimizations in C, too, but at much higher cost.

(It's not a general weapon; templates help only in specific scenarios - usually a single algorithm applied to different data types, or with differing small pieces of code injected.)
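
A minimal sketch of the `qsort` vs `std::sort` difference described above:

```cpp
#include <algorithm>
#include <cstdlib>
#include <functional>
#include <vector>

// The C comparator is an opaque function pointer: qsort makes an indirect
// call for every single comparison.
int cmp_double(const void* a, const void* b) {
    double x = *static_cast<const double*>(a);
    double y = *static_cast<const double*>(b);
    return (x > y) - (x < y);
}

int main() {
    std::vector<double> v(3);
    v[0] = 3.0; v[1] = 1.0; v[2] = 2.0;

    std::qsort(&v[0], v.size(), sizeof(double), cmp_double);

    // std::sort is instantiated with the comparator's exact type, so the
    // comparison can be inlined straight into the sorting loop.
    std::sort(v.begin(), v.end(), std::less<double>());
}
```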

peterchen
A: 

Good answers. I would put it like this:

  1. Make the algorithm as efficient as possible in terms of its formal structure.

  2. C++ will be as fast as C, except that it will tempt you to do dumb things, like constructing objects that you don't have to, so don't take the bait. Things like STL container classes and iterators may look like the latest-and-greatest thing, but they will kill you in a hotspot.

  3. Even so, single-step it at the disassembly level. You should see it working very directly on your problem. If it is spending lots of cycles getting in and out of routines, try some in-lining (or macros). If it is spending much of its time in memory allocation and freeing, put a stop to that. If it has inner loops where the loop overhead is a large percentage, try unrolling the loop (see the sketch below).

That's how you can make it about as fast as possible.
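
Here is the unrolling sketch referenced in item 3 (illustrative only; modern compilers often do this themselves at -O2/-O3, so measure before and after):

```cpp
// Manual unrolling of a hot summation loop: four independent
// accumulators hide floating-point addition latency.
double sum_unrolled(const double* a, int n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; ++i) s0 += a[i];  // remainder elements
    return (s0 + s1) + (s2 + s3);
}
```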

Mike Dunlavey
A: 

Hi,

I would definitely go with C++. If you are careful about your design and avoid creating heavy objects inside hotspots, you should not see any performance difference, and the code is going to be much simpler to understand, maintain, and extend.

Use templates and classes judiciously. Avoid unnecessary object creation by passing objects by reference. Avoid excessive memory allocation; if needed, allocate memory ahead of hotspots. Use the restrict keyword on memory pointers to tell the compiler whether pointers overlap or not.

As far as optimization goes, pay careful attention to memory alignment. Assuming you are working on an Intel processor, you can make use of vector instructions, provided you tell the compiler through pragmas about your memory alignment and aliased pointers. You can also use vector instructions directly via intrinsics.

You can also automatically generate hotspot code using templates and let the compiler optimize it, if you have things like short loops of different sizes. To find performance bottlenecks and drill down into them, Intel VTune or oprofile are extremely helpful.
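
A minimal sketch of the intrinsics route (SSE, x86 only; assumes the arrays are 16-byte aligned and n is a multiple of 4):

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86/x86-64 only)

// Element-wise addition, four floats per iteration. The aligned
// load/store forms (_mm_load_ps/_mm_store_ps) require 16-byte-aligned
// pointers; use the _mm_loadu_ps/_mm_storeu_ps forms otherwise.
void add_arrays(float* dst, const float* a, const float* b, int n) {
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_load_ps(a + i);  // aligned 4-float load
        __m128 vb = _mm_load_ps(b + i);
        _mm_store_ps(dst + i, _mm_add_ps(va, vb));
    }
}
```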

Hope that helps.

aaa
A: 

I do some DSP coding, where it still pays off to go to assembly language sometimes. I'd say use C or C++, either one, and be prepared to go to assembly language when you need to, especially to exploit SIMD instructions.

Nosredna