My primary goal is to learn from a popular web server codebase (implemented in C), with priority given to structure and design rather than to neat tricks scattered through the code.

I didn't include Apache since its code base is an order of magnitude larger than the two mentioned.

+3  A: 

To be honest, neat tricks turn up in any codebase worth its salt. The answer you probably don't want to hear is that it would be good to study both, so you can learn from where they intersect. Studying only one might leave you stuck in the box of the "lighttpd" way or the "nginx" way, etc.

BobbyShaftoe
Good point about the intersection.
droidix
A: 

I didn't include Apache since its code base is an order of magnitude larger than the two mentioned.

Actually, Apache's code is quite readable. It has a large code base because it does lots of things, but it is well structured and quite easy to understand. You can also check out the APR library (Apache Portable Runtime), which has a plethora of small things to learn from.

IMO, if you want to learn programming, you should start with lower-profile projects: not an HTTPd, but something simpler.

Both nginx and lighttpd (just like Apache) are production-quality software, which means a very steep learning curve. And the learning unfortunately often means digging through archives to see why the code is the way it is. That comes with age to any mature project.

If you are simply into C and learning design, you might want to check out FreeBSD or its derivatives. In my experience it is a better place to start: there are lots of tools and libraries of all calibers there. And their TODO lists are never empty, which serves well as a guide to where to start.

Dummy00001
HTTP servers are not a bad place to start, because they are based around a simple text protocol. Also, one of the most important things in reverse-engineering knowledge from code bases is goal-oriented motivation. I don't think a blanket statement sending him in a different direction will increase his engineering skills.
Hassan Syed
@Hassan: "simple text protocol" - please check RFC 1945 (HTTP/1.0) or RFC 2616 (HTTP/1.1) and tell me where you found the "simple" there. I have implemented HTTP 1.0 (with 1.1 elements) on several occasions and know for a fact that it is not a trivial undertaking to reach the milestone of supporting all mandatory features. With all optional features plus portability, you get a rather complicated piece of software. With all the performance optimization ... One might need weeks to reverse engineer the logic which led to the creation of the code. And that is a bad way to start.
Dummy00001
@dummy Just because the protocol is simple (to understand), it does not follow that its implementation will be :D I don't deny it takes weeks (it takes months) to get one's head around the undocumented semantics of a piece of system software (web servers, databases, mail servers, etc.). Reversing IIS's ISAPI interface in full detail took me about 6 months (during development of another complex system), and I have been studying nginx for 2 months myself now.
Hassan Syed
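To make the "simple to understand, not simple to implement" point concrete, here is a minimal sketch in C of parsing only the first line of a request, assuming the `Method SP Request-URI SP HTTP-Version` shape from RFC 1945. The names `struct request_line` and `parse_request_line` and the buffer sizes are made up for illustration and come from neither nginx nor lighttpd. Even this fragment needs several checks, and it still does not handle the version-less HTTP/0.9 "simple request", headers, or any 1.1 features.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parts of a request line (sizes are arbitrary). */
struct request_line {
    char method[16];
    char target[256];
    int  major;
    int  minor;
};

/*
 * Parse "METHOD SP target SP HTTP/x.y", optionally followed by CRLF or LF.
 * Returns 0 on success and -1 on any malformed input.
 * This is a sketch: it does not validate the method token or the target syntax.
 */
int parse_request_line(const char *line, struct request_line *out)
{
    const char *sp1, *sp2, *rest;
    size_t mlen, tlen;
    int consumed = 0;

    assert(line != NULL && out != NULL);

    /* Locate the two single spaces that separate the three fields. */
    sp1 = strchr(line, ' ');
    if (sp1 == NULL)
        return -1;
    sp2 = strchr(sp1 + 1, ' ');
    if (sp2 == NULL)
        return -1;

    /* Reject empty fields (e.g. doubled spaces) and fields that do not fit. */
    mlen = (size_t)(sp1 - line);
    tlen = (size_t)(sp2 - sp1 - 1);
    if (mlen == 0 || mlen >= sizeof out->method)
        return -1;
    if (tlen == 0 || tlen >= sizeof out->target)
        return -1;

    memcpy(out->method, line, mlen);
    out->method[mlen] = '\0';
    memcpy(out->target, sp1 + 1, tlen);
    out->target[tlen] = '\0';

    /* %n records how many characters the version field consumed. */
    if (sscanf(sp2 + 1, "HTTP/%d.%d%n", &out->major, &out->minor, &consumed) != 2)
        return -1;
    if (out->major < 0 || out->minor < 0)
        return -1;

    /* Only end of string or a line terminator may follow the version. */
    rest = sp2 + 1 + consumed;
    if (strcmp(rest, "") != 0 && strcmp(rest, "\r\n") != 0 && strcmp(rest, "\n") != 0)
        return -1;

    return 0;
}
```

Calling `parse_request_line("GET /index.html HTTP/1.0\r\n", &r)` fills in `r` and returns 0, while a line with a missing version or trailing junk returns -1. A production server also has to cope with partial reads, oversized input, and tolerant parsing of sloppy clients, which is where the weeks go.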
@dummy You are right that it is the wrong way around. However, I am personally starting to think that it is impractical to create useful documentation on the subtleties of such systems (i.e., reversing is inevitable). I therefore believe that it is an essential skill to master, and interest in picking up this rare skill should not be diverted to simpler systems. Besides, reversing leads one to discover the sub-modules that constitute the simpler systems anyhow, and those may be studied further. When push comes to shove, a systems programmer will need to get his hands dirty.
Hassan Syed
A: 

Nginx might just be the best straight-C code base I have encountered. I have read large chunks of Apache and always came out feeling unclean; it is a monolithic mess.

By exploring Nginx you will learn not just about web servers, but pretty much the best practices for writing networked software under Unix in straight C, from code architecture to meta-programming techniques.

I have heard nothing but good things about Lighttpd; however, it is limited in scope compared to Nginx. Therefore I would invest time in Nginx if I were you, although Lighttpd's limited scope might be beneficial to you as a first target to study.

Hassan Syed