Working on different projects, I am free to choose the programming language, as long as the task gets done.

I was wondering what the real difference is, in terms of performance, between writing a program in Python, versus doing it in C.

The tasks to be done are pretty varied, e.g. sorting textfiles, disk access, network access, textfile parsing.

Is there really a noticeable difference between sorting a textfile using the same algorithm in C versus Python, for example?

And in your experience, given the power of current CPU's (i7), is it really a noticeable difference (consider that it's a program that doesn't bring the system to its knees)?

Thanks! :)

+9  A: 

In general, I/O-bound work will depend more on the algorithm than on the language. In this case I would go with Python because it has first-class strings and lots of easy-to-use libraries for manipulating files, etc.

ChaosPandion
+1: Unless it involves many compute-intensive loops, the limiting factors always seem to be OS resources like file systems and process slots and memory.
S.Lott
A simple way to think about the performance difference: if you are going to process the text character by character, Python will probably be too slow, but whenever you can do your job with bigger chunks (processed by built-in functions) then Python performance will usually be comparable to that of C, and the code will be much, much simpler.
Jacek Konieczny
@Jacek - Good point. You can also write high performance code in C for portions of your project as @jshen noted in his answer.
ChaosPandion
"(processed by built-in functions) then Python performance will usually be comparable to that of C" ... Oh please. I get what you are saying, but... oh please.
JustBoo
@JustBoo: Careful of bashing Python. The list sort in Python is amazingly fast. The built-in functions are -- in many cases -- syntactic sugar over the C library. If Python is used wisely, it isn't inherently slow. Of course, bad choice of algorithm can make any language horribly slow.
S.Lott
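
To make the character-by-character versus built-in distinction from the comments above concrete, here is a minimal sketch. The function names and the sample text are made up for illustration. Two of the functions count newlines, one with an explicit Python loop and one with the built-in `str.count`, which runs as C code inside CPython; the third sorts lines with the built-in sort. Timings depend on the machine and the data, so none are quoted here.

```python
import timeit

def count_newlines_loop(text):
    """Count newlines one character at a time in interpreted Python."""
    total = 0
    for ch in text:
        if ch == "\n":
            total += 1
    return total

def count_newlines_builtin(text):
    """Count newlines with one built-in call (C code inside CPython)."""
    return text.count("\n")

def sort_lines(text):
    """Sort the lines of a text with the built-in sort."""
    return sorted(text.splitlines())

if __name__ == "__main__":
    sample = "pear\napple\nfig\n" * 10_000  # hypothetical sample data
    loop_time = timeit.timeit(lambda: count_newlines_loop(sample), number=5)
    builtin_time = timeit.timeit(lambda: count_newlines_builtin(sample), number=5)
    print(f"loop: {loop_time:.4f}s  built-in: {builtin_time:.4f}s")
```

Both counting functions return the same value; only the amount of work done in the interpreter differs.
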
+6  A: 

Is there really a noticeable difference between sorting a textfile using the same algorithm in C versus Python, for example?

Yes.

The noticeable differences are these:

  1. There's much less Python code.

  2. The Python code is much easier to read.

  3. Python supports really nice unit testing, so the Python code tends to be higher quality.

  4. You can write the Python code more quickly, since there are fewer quirky language features. Having no preprocessor, for example, really saves a lot of hacking around. Super-experienced C programmers hardly notice it. But all that #include sandwich stuff and making the .h files correct is remarkably time-consuming.

  5. Python can be easier to package and deploy, since you don't need a big fancy make script to do a build.

S.Lott
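
As an illustration of point 3 in the answer above, here is a minimal sketch using the standard-library `unittest` module. The function under test, `normalize_lines`, and the test names are hypothetical; the point is only how little ceremony a test case needs.

```python
import io
import unittest

def normalize_lines(text):
    """Hypothetical function under test: strip each line and drop blank ones."""
    return [line.strip() for line in text.splitlines() if line.strip()]

class NormalizeLinesTest(unittest.TestCase):
    def test_strips_whitespace(self):
        self.assertEqual(normalize_lines("  a \n b\n"), ["a", "b"])

    def test_drops_blank_lines(self):
        self.assertEqual(normalize_lines("a\n\n   \nb"), ["a", "b"])

def run_tests():
    """Run the test case quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeLinesTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)

if __name__ == "__main__":
    result = run_tests()
    print("tests run:", result.testsRun, "all passed:", result.wasSuccessful())
```
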
2. Opinion. Some people hate the whitespace-based code. 3. Nothing to do with Python, C has unit testing libraries. 4. Header files aren't really time consuming if you have a good editor. 5. No. There is nothing simpler than C for packaging because Makefiles are very simple.
mathepic
Up until I started messing around with the *Quake 3* source code I hadn't realized how much ceremony is involved with complex C projects. *(Previously I had only played around with micro-controllers.)*
ChaosPandion
I forgot: -1 because the entire answer has nothing to do with performance.
mathepic
Readability has nothing to do with liking the code. Python actually does have more readable code than C, though that might have something to do with the fact that there's less of it.
sreservoir
Readability DOES have to do with being able to clearly see where a function ends. The whole thing is an opinion. Some people like the brief code - Others like the so called repetition that shows your intent.
mathepic
@mathepic: performance? The programmer's performance is often the most expensive part of software development. Are you saying that optimizing the programmer's time has no value?
S.Lott
@mathepic A Makefile isn't the same as a package. And a simple Makefile rarely gets the job done on all platforms.
Michael Mior
This question is not about the performance of the programmer. It is about the performance of the program. Therefore, this answer is off topic and is simply evangelizing Python over C.
mathepic
@mathepic - You are really borderline trolling here. With some work programmers can be productive in any language. Python simply takes less effort to be productive in.
ChaosPandion
@mathepic: "question is not about ... It is about ..." Really? What evidence do you have for this assertion?
S.Lott
"given the power of current CPU's (i7), is it really a noticeable difference". It seems reasonably unlikely to me that the questioner is asking about programmer time, he's asking about runtime. He probably *should* be asking about programmer time, though, so this is a relevant answer even if the questioner doesn't know it yet. IMO though if you're guaranteeing that the Python code will not run noticeably slower than the C code, it would be best to do so explicitly rather than by omission ("the noticeable differences are..."). Despite the rhetorical power of that omission.
Steve Jessop
@Steve Jessop: They're not asking about IDE performance with that "power of current CPU's (i7)" business?
S.Lott
@Steve - Yes, the original question may have been with regards to performance but sometimes we can help the questioner more by shifting the direction of the question to something more relevant.
ChaosPandion
@S.Lott: Questioner doesn't mention IDEs, that I can see. Are you saying you think they do mean an IDE, or saying that you don't know whether they do or not? I think the balance of probability is that they are not, because people who throw around words like "in terms of performance" and "CPU power", without being precise, are almost invariably thinking about how fast their code will run, not how long it will take them to write it. Assuming that the questioner doesn't have an i7 for a brain, that is.
Steve Jessop
@Steve Jessop: "... don't know whether they do or not". Correct. "people who throw around words like... are almost invariably thinking about how fast their code will run" Generally true. "sometimes we can help the questioner more by shifting the direction of the question". My point precisely.
S.Lott
+1 for mentioning the hackiness of C. Every C program comes with a massive build script that usually runs hundreds of thousands of lines just to check whether your C compiler and bash crap have certain features. @mathepic "2. Opinion. Some people hate the whitespace-based code." Anyone who thinks Python is less readable than C either A) doesn't know C, B) doesn't know Python, or C) is just another bad coder who writes bad code. Please, show me a good 1M LOC project in a C-based-syntax language without indentation, I'd really like to see that.
Longpoke
@Longpoke: As I said, it's an OPINION. I don't dislike indentation - I dislike syntactic indentation because it means it's impossible to automatically indent correctly. As for the hundreds of thousands of lines, it's automatically generated and not that big. (I haven't even stated that the Python opinion is wrong, yet I'm being insulted by these people for stating that their opinions aren't always the global opinion...)
mathepic
"impossible to automatically indent correctly"? What does that mean? Impossible? Who -- or what -- is automatically indenting? An IDE? Mine all work perfectly. What are you saying is "impossible" here?
S.Lott
"I dislike syntactic indentation because it means it's impossible to automatically indent correctly." No it isn't, I do it every day. The fact is that Python is easier to read, it doesn't matter if one thinks it looks pretty or not, because it's easier to read and maintain, which is all that matters. Please don't BS me and tell me it's easier to read `int reg_cb(Thing** (*cb)(int**, int, int), int x, int y) {...}` than `def reg_cb(cb, x, y): ...`.
Longpoke
BTW Ada was invented because C was deemed unfeasibly hard to maintain and ensure safety for large scale mission critical DoD operations. Maybe you should go argue with DoD if you think C is more readable than <insert sane modern language here>.
Longpoke
+23  A: 

Use Python until you have a performance problem. If you ever have one, figure out what the problem is (often it isn't what you would have guessed up front). Then solve that specific performance problem, which will likely call for an algorithm or data structure change. In the rare case that your problem really needs C, you can write just that portion in C and use it from your Python code.

jshen
+1: Get things working first. Then optimize.
S.Lott
Look at the compiled Cython language before you write any C. Cython compiles to shared libraries that can be directly imported into Python.
Eike
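
The "figure out what the problem is" step above can be sketched with the standard-library profiler. This is only an illustration: `join_words_naive` and `process` are hypothetical stand-ins for real program code, and `profile_call` is a small helper invented here around `cProfile` and `pstats`.

```python
import cProfile
import io
import pstats

def join_words_naive(words):
    """Hypothetical hot spot: builds the result by repeated concatenation."""
    result = ""
    for word in words:
        result = result + word + " "
    return result.strip()

def process(words):
    """Hypothetical program step that calls the function above."""
    return join_words_naive(words).upper()

def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()

if __name__ == "__main__":
    _, report = profile_call(process, ["alpha", "beta", "gamma"] * 1000)
    print(report)
```

The report lists each function with its call count and time, which shows where the time actually goes before anything is rewritten.
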
+3  A: 

The first rule of computer performance questions: Your mileage will vary. If small performance differences are important to you, the only way you will get valid information is to test with your configuration, your data, and your benchmark. "Small" here is, say, a factor of two or so.

The second rule of computer performance questions: For most applications, performance doesn't matter -- the easiest way to write the app gives adequate performance, even when the problem scales. If that is the case (and it is usually the case) don't worry about performance.

That said:

  • C compiles down to machine code and thus has the potential to execute at least as fast as any other language
  • Python is generally interpreted and thus may take more CPU than a compiled language
  • Very few applications are "CPU bound." I/O (to disk, display, or memory) is not greatly affected by compiled vs interpreted considerations and frequently is a major part of computer time spent on an application
  • Python works at a higher level of abstraction than C, so your development and debugging time may be shorter

My advice: Develop in the language you find the easiest with which to work. Get your program working, then check for adequate performance. If, as usual, performance is adequate, you're done. If not, profile your specific app to find out what is taking longer than expected or tolerable. See if and how you can fix that part of the app, and repeat as necessary.

Yes, sometimes you might need to abandon work and start over to get the performance you need. But having a working (albeit slow) version of the app will be a big help in making progress. When you do reach and conquer that performance goal you'll be answering performance questions in SO rather than asking them.

mpez0
It's easier to search a linked list than a dynamically allocated block of memory, but should I use a linked list for a search?
mathepic
@mathepic: You might want to use a tree for a search. Or you might want to use a hashmap. I'm not sure what you're getting at with your comment.
S.Lott
@mathepic: both C and Python support linked lists and dynamically allocated memory, not to mention other techniques. Since the question asks for C vs Python, I don't get the purpose of your comment, either.
mpez0
The purpose of my comment was to show how the easiest thing to program is not always the best way to do it.
mathepic
+4  A: 

If the text files you are sorting and parsing are large, use C. If they aren't, it doesn't matter. You can write poor code in any language, though. I have seen simple C code for calculating areas of triangles run 10x slower than other C code because of poor memory management, use of structures, pointers, etc.

Your I/O algorithm should be independent of your compute algorithm. If this is the case, then using C for the compute algorithm can be much faster.

Derek
+1 A sane answer.
Andrei Ciobanu
-1 This is nonsense. If the file is large, this computation is I/O bound, and thus won't be any faster in C due to context switching and cache coherency issues. If we are talking memory overhead, both C and Python are perfectly capable of reading and processing **chunks** of the file at once.
Longpoke
+1 to counteract nonsense -1. @Longpoke Just because the I/O is a bottleneck doesn't mean that the processing code can't be a second bottleneck.
mathepic
@mathepic The processing code footprint is asymptotically insignificant compared to the I/O of reading a giant file. Please go learn db4o or something (a database written in Java which is faster than RDBMS which are written in C/C++). Or at least go learn what cache coherency means... As soon as you mess with any decent I/O, all these illusionary advantages that C has over other languages are flat out __destroyed__.
Longpoke
@Longpoke You are assuming that the processing code is not extremely complex. Yes, the I/O part will perform the same, but the processing code still has to do complex things.
mathepic
Any decently designed software that is working on an out of core dataset is going to have the I/O be separate from the computational portion. Tuning the application to have the appropriate amount of I/O overlapping the computational portion will keep the processor "fed" with data. In flat memory design, unstructured C software, your software will be faster than just about any other language with the exception of FORTRAN.
Derek
@mathepic: Yes, C may be faster if it does heavy processing on the read data, but that's only really against Python. Comparing C to another statically typed language won't make much of a difference because the code is almost exactly the same (unless you do some hacks in C with pointers etc). In any case, you could still have written the entire system in Python and seen an unnoticeable speed difference. Perhaps the processing part may be slow (unlikely); in that case you can just write the processing part in C/Ada/FORTRAN/Go/D/etc and the rest in Python or another HLL.
Longpoke
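
The two ideas that come up in this thread, keeping I/O separate from computation and reading a large file in chunks, can be sketched as follows. This is an illustration under assumptions, not anyone's actual code: `read_chunks` and `count_words` are hypothetical names, and word counting stands in for whatever processing is really needed.

```python
import io

def read_chunks(stream, chunk_size=64 * 1024):
    """I/O side: yield successive text chunks from an open stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk

def count_words(chunks):
    """Compute side: count whitespace-separated words across chunks.

    A word may be split across a chunk boundary, so the trailing partial
    word of each chunk is carried over to the next one.
    """
    total = 0
    carry = ""
    for chunk in chunks:
        text = carry + chunk
        words = text.split()
        # If the text does not end in whitespace, its last word may continue.
        if words and not text[-1].isspace():
            carry = words.pop()
        else:
            carry = ""
        total += len(words)
    if carry:
        total += 1
    return total

if __name__ == "__main__":
    fake_file = io.StringIO("the quick brown fox jumps over the lazy dog\n" * 3)
    print(count_words(read_chunks(fake_file, chunk_size=7)))
```

Because `count_words` only sees an iterable of strings, the compute side could later be replaced by a faster implementation without touching the reading code.
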
+8  A: 

C will absolutely crush Python in almost any performance category, but C is far more difficult to write and maintain, and high performance isn't always worth the trade-off of increased development time and difficulty.

You say you're doing things like text file processing, but what you omit is how much text file processing you're doing. If you're processing 10 million files an hour, you might benefit from writing it in C. But if you're processing 100 files an hour, why not use Python? Do you really need to be able to process a text file in 10ms vs 50ms? If you're planning for the future, ask yourself, "Is this something I can just throw more hardware at later?"

Writing solid code in C is hard. Be sure you can justify that investment in effort.

lazyconfabulator
"C will absolutely crush Python in almost any performance category" Perhaps... until you actually want to write a real application, which needs higher level constructs such as hashtables, sets, higher order functions, concurrency, etc, which are already built in Python.
Longpoke
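
The volume question in the answer above comes down to simple arithmetic, sketched here. The file counts (100 and 10 million per hour) and the per-file times (10 ms and 50 ms) are the answer's own figures; the function names and the assumption that files are processed serially on one core are added for illustration.

```python
def work_seconds_per_hour(files_per_hour, ms_per_file):
    """Total processing time needed for one hour's worth of files."""
    return files_per_hour * ms_per_file / 1000

def fits_on_one_core(files_per_hour, ms_per_file):
    """True if serial processing on one core keeps up (assumed model)."""
    return work_seconds_per_hour(files_per_hour, ms_per_file) <= 3600

if __name__ == "__main__":
    for files in (100, 10_000_000):
        for ms in (10, 50):
            needed = work_seconds_per_hour(files, ms)
            print(f"{files:>10} files/hour at {ms} ms each: "
                  f"{needed:,.0f} s of work per hour")
```

At 100 files an hour, even 50 ms per file adds up to 5 seconds of work per hour, so the language hardly matters. At 10 million files an hour, 10 ms per file already means 100,000 seconds of work per hour, more than one core can do serially, which is where both faster code and more hardware come into play.
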
+1  A: 

It really depends a lot on what you're doing and whether the algorithm in question is available in Python via a natively compiled library. If it is, then I believe you'll be looking at performance numbers close enough that Python is most likely your answer -- assuming it's your preferred language. If you must implement the algorithm yourself, depending on the amount of logic required and the size of your data set, C/C++ may be the better option. It's hard to provide a less nebulous answer without more information.

tanis
+2  A: 

(Assumption - The question implies that the author is familiar with C but not Python, so I will base my answer on that.)

I was wondering what the real difference is, in terms of performance, between writing a program in Python, versus doing it in C.

C will almost certainly be faster unless it is implemented poorly, but the real questions are:

  • What are the development implications (development time, maintenance, etc.) for either implementation?
  • Is the performance benefit significant?

Learning Python can take some time, but there are Python modules that can greatly reduce development time. For example, the csv module in Python makes reading and writing CSV easy. Also, Python strings, arrays, maps, and other objects make it more flexible than plain C and more elegant, in my opinion, than the equivalent C++. Some things like network access may be much quicker to develop in Python as well.

However, it may take time to learn how to program Python well enough to accomplish your task. Since you are concerned with performance, I suggest trying a simple task, such as sorting a text file, in both C and Python. That will give you a better baseline on both languages in terms of performance, development time, and possibly maintenance.

Ryan
And make sure that you run your Python code past an experienced Python developer. C is not the only language with room for poor programming to drastically increase both development time and running time.
aaronasterling
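
As a small illustration of the csv module mentioned in the answer above, the sketch below parses CSV text, sorts the rows by one column, and writes them back out. The helper name `sort_csv_rows` and the sample data are invented for the example, and it assumes the input has a header row.

```python
import csv
import io

def sort_csv_rows(csv_text, column):
    """Parse CSV text, sort rows by one named column, return CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = sorted(reader, key=lambda row: row[column])
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

if __name__ == "__main__":
    # Hypothetical data; note the quoted field containing a comma.
    data = 'name,city\n"Smith, Jane",Oslo\nAdams,Rome\n'
    print(sort_csv_rows(data, "name"))
```

The quoted field with an embedded comma is handled by the module, which is the kind of detail that takes noticeably more code to get right by hand in C.
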
A: 

Across all programs, it isn't really possible to say whether things will be quicker or slower on average in Python or C.

For the programs that I've implemented in both languages, using similar algorithms, I've seen no improvement (and sometimes a performance degradation) for string- and I/O-heavy code when reimplementing Python code in C. The execution time is dominated by allocation and manipulation of strings (functionality that Python implements very efficiently) and waiting for I/O operations (which incurs the same overhead in either language), so the extra overhead of Python makes very little difference.

But for programs that do even simple operations on image files, say (images being large enough for processing time to be noticeable compared to IO), C is enormously quicker. For this sort of task the bulk of the time running the python code is spent doing Python Stuff, and this dwarfs the time spent on the underlying operations (multiply, add, compare, etc.). When reimplemented as C, the bureaucracy goes away, the computer spends its time doing real honest work, and for that reason the thing runs much quicker.

It's not uncommon for the Python code to run in (say) 5 seconds where the C code runs in (say) 0.05. So that's a 100x speedup -- but in absolute terms, this is not so big a deal. It takes so much less time to write Python code than it does to write C code that your program would have to be run some huge number of times to turn a time profit. I often reimplement in C, for various reasons, but if you don't have this requirement then it's probably not worth bothering. You won't get that part of your life back, and next year computers will be quicker.

brone
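
The "time profit" argument above can be turned into a quick break-even calculation. The 5 s and 0.05 s run times are the answer's illustrative numbers; the 8 extra hours of development for the C version is purely an assumed figure, as is the function name.

```python
def break_even_runs(extra_dev_hours, slow_seconds, fast_seconds):
    """Runs needed before the time saved per run repays the extra dev time."""
    saved_per_run = slow_seconds - fast_seconds
    if saved_per_run <= 0:
        raise ValueError("the rewrite must be faster to ever break even")
    return extra_dev_hours * 3600 / saved_per_run

if __name__ == "__main__":
    # 5 s and 0.05 s come from the answer; 8 extra hours is an assumption.
    runs = break_even_runs(extra_dev_hours=8, slow_seconds=5, fast_seconds=0.05)
    print(f"break-even after about {runs:,.0f} runs")
```

Under these assumptions the C version pays for itself only after several thousand runs, and that ignores the extra maintenance cost.
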
A: 

Actually, you can solve most of your tasks efficiently with Python.

You just need to know which tools to use. For text processing there is a brilliant package from the eGenix guys - http://www.egenix.com/products/python/mxBase/mxTextTools/. I was able to create very efficient parsers with it in Python, since all the heavy lifting is done by native code.

The same approach goes for any other problem - if you have performance problems, get a C/C++ library with a Python interface that efficiently implements whatever your bottleneck is.

Daniel Kluev
A: 

You will find C is much slower. Your developers will have to keep track of memory allocation, and use libraries (such as glib) to handle simple things such as dictionaries or lists, which Python has built in.

Moreover, when an error occurs, your C program will typically just crash, which means you'll need to get the error to happen in a debugger. Python would give you a stack trace (typically).

Your code will be bigger, which means it will contain more bugs. So not only will it take longer to write, it will take longer to debug, and will ship with more bugs. This means that customers will notice the bugs more often.

So your developers will spend longer fixing old bugs and thus new features will get done more slowly.

In the meantime, your competitors will be using a sensible programming language and their products will be rapidly gaining features and usability, while yours will look bad. Your customers will leave and you'll go out of business.

MarkR
-1. "your C program will typically just crash" you can write memory dumps on crash and debug them later.
SigTerm
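
For reference, this is roughly what the Python stack trace mentioned in the answer above looks like in practice. The sketch uses the standard-library `traceback` module; `parse_port`, `load_config`, and `describe_failure` are hypothetical names made up for the example.

```python
import traceback

def parse_port(text):
    """Hypothetical helper: convert a config value to an integer port."""
    return int(text)

def load_config(raw):
    """Hypothetical caller, so the traceback has more than one frame."""
    return {"port": parse_port(raw["port"])}

def describe_failure(func, *args):
    """Call func; return the formatted traceback if it raises, else None."""
    try:
        func(*args)
    except Exception:
        return traceback.format_exc()
    return None

if __name__ == "__main__":
    print(describe_failure(load_config, {"port": "eighty"}))
```

The printed trace names each function on the call path and ends with the exception type and message, without attaching a debugger.
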
+1  A: 

To get an idea of the raw difference in speed, check out the Computer Language Benchmarks Game.

Then you have to decide whether that difference matters to you.

Personally, I ended up deciding that it did, but most of the time, instead of using C, I ended up using other higher-level languages. I mostly use Scala, but Haskell, C#, and Java each have their advantages too.

Rex Kerr
A: 

The extra time it takes to write the code in C rather than Python will be far greater than the difference between C and Python execution speed.

Longpoke