Edit: Due to the extremely subjective nature of comparing programming languages, in particular the open hostility many in the C and C++ camps have towards each other, I would like to make it absolutely clear that I am attempting to explain a point of view rather than champion one language over the other.
Fast servers try to minimize the amount of copying done. This means that any strings you parse out of an IO buffer are represented using pointers. Using std::string
would mandate copying the string, which is slow. You could create a new string class which doesn't copy its contents, but this string class would be no better than a C structure. Edit: Sometimes, modifying code to use C++ structures will require either additional copying, additional indirection, or writing the code in the same manner as the C code. For example, std::list<T>
will copy any T
that you add to it, while std::list<T*>
requires an additional object for each member and has an additional layer of indirection. The least expensive option might be to put next
and prev
fields inside the T
structure itself. I think Boost has a library for handling this case, but linked lists are second nature to experienced C programmers so there's not a huge benefit to the Boost option and some programmers stylistically prefer handling linked lists themselves. Some of the people commenting below suggested that passing a std::string&
or my_string&
is as cheap as passing around pointers to a buffer — this is not entirely true, since in order to access the buffer one now has to dereference two pointers, where the C code would only have to dereference one. The differences vanish if my_string
is passed by value, as long as it doesn't do anything too fancy in its copy constructor... however, such an object would no longer be any safer than the equivalent C construct (since in this case, my_string
can now outlast the buffer to which it points).
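The non-copying representation described above can be sketched as a (pointer, length) pair into the I/O buffer. The `slice` and `take_until` names here are my own invention, not from any of the servers mentioned:

```cpp
#include <cstddef>
#include <cstring>

// A non-owning view into an I/O buffer: just a pointer and a length.
// Parsing produces these instead of copying bytes into a std::string.
struct slice {
    const char *data;
    std::size_t len;
};

// Split off the first token up to `delim`, without copying any bytes.
// Both the returned token and the advanced `in` alias the original buffer.
static slice take_until(slice *in, char delim) {
    slice tok = { in->data, 0 };
    while (tok.len < in->len && in->data[tok.len] != delim)
        ++tok.len;
    std::size_t skip = tok.len + (tok.len < in->len ? 1 : 0);
    in->data += skip;
    in->len  -= skip;
    return tok;
}
```

Note the lifetime hazard mentioned above applies here too: a `slice` is only valid as long as the buffer it points into.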
Fast servers try to minimize the amount of allocation done. So when a programmer is writing a fast server, it helps if none of the allocations and deallocations are hidden from view. This means that all of those helpful ways to make allocations unobtrusive in C++ get in the way. Edit: A smart C++ programmer will be very aware of where the code performs allocations. Nonetheless, a C programmer doesn't have to think quite so hard about it.
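The pool-allocation style these servers favor (Apache's `apr_pool_t` is the well-known example) can be illustrated with a toy bump allocator. This is my own sketch, not Apache's API — the point is that every allocation is explicit and releasing a whole request's memory is one call:

```cpp
#include <cstddef>
#include <cstdlib>

// Toy bump allocator: one malloc up front, then each allocation is just
// a pointer increment. Nothing is hidden; freeing is a single bulk release.
struct pool {
    char *base;
    std::size_t used, cap;
};

static bool pool_init(pool *p, std::size_t cap) {
    p->base = static_cast<char *>(std::malloc(cap));
    p->used = 0;
    p->cap  = cap;
    return p->base != nullptr;
}

static void *pool_alloc(pool *p, std::size_t n) {
    n = (n + 7) & ~std::size_t(7);           // keep 8-byte alignment
    if (p->cap - p->used < n) return nullptr;
    void *out = p->base + p->used;
    p->used += n;
    return out;
}

static void pool_free_all(pool *p) {         // e.g. once per request
    std::free(p->base);
    p->base = nullptr;
    p->used = p->cap = 0;
}
```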
Fast servers need to work with syscalls very closely, and keep data in the format that the syscalls expect whenever possible. Syscalls have a C interface. Edit: It's very easy to call them from C++, too, but wrapping all of the objects in C++ classes is a lot of work for not much benefit.
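As an example of keeping data in the format the syscall expects: the POSIX `writev` call takes an array of `struct iovec` (pointer, length) pairs, so a server can hand the kernel several buffers in one syscall with no concatenation. The `send_response` wrapper here is hypothetical:

```cpp
#include <sys/uio.h>   // writev, struct iovec (POSIX)
#include <unistd.h>
#include <cstring>

// Send a header and body in one syscall by giving the kernel an array
// of (pointer, length) pairs -- no copying the pieces into one buffer.
static ssize_t send_response(int fd, const char *hdr, const char *body) {
    struct iovec iov[2];
    iov[0].iov_base = const_cast<char *>(hdr);
    iov[0].iov_len  = std::strlen(hdr);
    iov[1].iov_base = const_cast<char *>(body);
    iov[1].iov_len  = std::strlen(body);
    return writev(fd, iov, 2);   // kernel gathers both buffers
}
```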
Exception handling causes code size to increase. This decreases the locality of the program code and slows it down, even if no exceptions are being thrown. Edit: I am aware that this is a negligible (but measurable) difference in speed. I've heard the difference in executable size is something like 5-15%, which matches the results I got using quick tests with and without -fno-exceptions
using g++
. I'd say it's more relevant that some of these servers use a somewhat unusual error handling technique: if code is handling a request and there's an error, you can just jump out to an outermost error handling routine and let it handle cleanup. If you're using memory pools, such as Apache's apr_pool_t
, then there's no need to call destructors as you unwind the stack and therefore the advantages of C++ exceptions over longjmp
are fairly slim.
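A sketch of that jump-to-the-outermost-handler style, with invented names; the commented-out pool call stands in for the one bulk cleanup. (In C++, `longjmp` over objects with nontrivial destructors is undefined behavior — which is exactly why this style pairs with pools and plain structs rather than destructors.)

```cpp
#include <csetjmp>

static std::jmp_buf fail_point;

// Deep inside request handling, any error jumps straight out.
static void parse_headers(bool ok) {
    if (!ok)
        std::longjmp(fail_point, 1);      // unwind with no destructors
    // ... normal parsing ...
}

// Outermost handler: on error there is nothing to clean up but the pool.
static int handle_request(bool ok) {
    if (setjmp(fail_point) != 0) {
        // pool_free_all(&request_pool);  // one bulk release, then
        return 500;                       // report the error
    }
    parse_headers(ok);
    return 200;
}
```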
Template instantiations cause code size to skyrocket. This also decreases the locality of the program code. Edit: The instantiations are exactly what give C++ templates their speed advantage over near-equivalent C code. For example, if you want to sort an array of X, you can get a fully inlined, rock-solid sorting function in C++ using templates in a single line of code. In C, you would have to write it yourself. But these servers do not use very many different kinds of data structures; often they need only some simple queues and an associative map from strings to void*
.
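The trade-off can be seen side by side. `std::sort` is instantiated per element type with the comparison inlined into the generated code; `qsort` is one shared function that calls the comparison through a function pointer. The `point` type here is just an illustration:

```cpp
#include <algorithm>
#include <cstdlib>
#include <cstddef>

struct point { int x, y; };

// C++: one line instantiates a full sorting function specialized for
// `point`, with the comparison inlined into the generated code.
static void sort_points(point *p, std::size_t n) {
    std::sort(p, p + n, [](const point &a, const point &b) {
        return a.x < b.x;
    });
}

// C equivalent: qsort calls this through a function pointer for every
// comparison, which compilers historically could not inline.
static int cmp_points(const void *a, const void *b) {
    const point *pa = static_cast<const point *>(a);
    const point *pb = static_cast<const point *>(b);
    return (pa->x > pb->x) - (pa->x < pb->x);
}
```

Each additional element type means another instantiation of `std::sort` in the binary — the code-size cost described above.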
It is relatively easy to estimate how long a particular piece of C code will take to run; it is more difficult to estimate the run time of C++ code. There are a lot of extra things that can happen in C++ code: temporary objects can be created to pass between functions, operators can be overloaded to perform complex tasks, and exiting a scope can cause expensive destructors to run. I've seen C++ programs hang in poorly written destructors. C, on the other hand, maps very cleanly to machine code, with relatively few surprises.
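A toy example of the hidden work: the two functions below do the same job, but each `+` in the C++ version may allocate and copy a temporary `std::string`, while the C-style version makes every byte of work explicit. Both functions are my own illustration:

```cpp
#include <string>
#include <cstdio>
#include <cstddef>

// Looks like one assignment, but each `+` can create a heap-allocated
// temporary std::string before the result is returned.
static std::string build_line(const std::string &method,
                              const std::string &path) {
    return method + " " + path + " HTTP/1.0\r\n";
}

// The C-style version: one formatting pass into a caller-owned buffer,
// no hidden allocations.
static int build_line_c(char *buf, std::size_t cap,
                        const char *method, const char *path) {
    return std::snprintf(buf, cap, "%s %s HTTP/1.0\r\n", method, path);
}
```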
(Edit: new) There's also the question of how a given piece of code performs when compiled with a C compiler versus a C++ compiler. Due to the stricter aliasing rules of C++, the performance benefit often goes to the code compiled with the C++ compiler. However, C99 introduced the new restrict
keyword which is not yet available in C++. It allows the user to declare that there are no active aliases to a given region of memory. This allows the compiler to perform some types of optimizations previously only available to Fortran programmers (as Fortran has stricter aliasing rules than C++).
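There is no standard `restrict` in C++, but g++ and clang accept `__restrict__` as an extension, which is enough to sketch what the qualifier promises:

```cpp
#include <cstddef>

// `restrict` (spelled __restrict__ as a GNU extension in C++) promises
// the compiler that `dst` and `src` never alias, so it may keep values
// in registers across the store to dst[i] and vectorize the loop.
static void scale(float *__restrict__ dst,
                  const float *__restrict__ src,
                  float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = k * src[i];
}
```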
So, if you were trying to write a very high performance server in C++, you'd probably avoid the STL, exceptions, templates (or use them sparingly), sophisticated constructors and destructors, and operator overloading. You'd also probably not wrap the C structures you use. What are you left with? You're left with 90% C and 10% C++. At this point, you say, "why bother?" and write the whole thing in C. There are added benefits to using C:
C compilers are available in places where C++ compilers are not. On some platforms, C++ compilers are available but not as mature as the C compiler. Conditions were worse in 1995, which was when Apache was written.
If you write your app in C, people can write glue so their favorite dynamic language (Python, Ruby, Perl) can call into it without much trouble; it is simply easier to interface with those languages from C.
I think the biggest part of it all is the syscalls. When designing a fast server, you basically plan how the bytes travel from one syscall to another as quickly as possible. I would guess that high-performance non-server apps that do more number crunching are more likely to be written in C++; for example, many fast AI libraries seem to be written in C++.
Edit: There's also a sampling error here. The projects listed are open-source projects only, and certain parts of the open-source community, especially the systems programming part, use C. Part of this is cultural, as perhaps the most visible open-source project is the Linux kernel, which is written in C. Linus Torvalds in particular is noted for his rants against C++; you can also look at the Linux kernel mailing list FAQ, which has a section on why Linux does not support C++ in the kernel. The Linux kernel has some even better reasons not to use C++, especially that exception handling in kernel code is generally regarded as a very bad idea (Apple's IOKit, which requires you to write drivers in C++, disables exception handling). I would say that C is much less common outside of the open-source community.
As for Python modules, most of the ones written in C are just interfaces to some other library (and a large portion of these modules are written in Python itself, natch). Most of the C code is spent converting between the formats that the library expects and formats that Python can understand. C++ doesn't deliver any real benefits for this kind of glue code. Fortunately, the glue code is very small (I wouldn't want to write very much of it).
As for a recommendation about which to learn: The nicest thing about programs like nginx, varnish, apache, git, subversion, python, etc. is that they're already written and ready for production use. I would definitely not recommend either C or C++ over the other unless I knew what kind of applications you wanted to write, as I'm guessing you're not going to write the next web server (we already have a good selection).