I'm doing a small project at my university that involves creating a web server using only C. I know a little about HTTP 1.1, and I've created a web server in C# before.

However, I'd like to see a nice little tutorial on either how to get started with socket programming and multithreading in C, or a complete "How to create a web server in C" tutorial.

There aren't really many requirements. I'd just like to get started with a small socket listening on port 80 and sending some response text to the web browser / telnet client. Some code-structure advice would help too. The biggest requirement is probably that it's for Linux.

Any good references are appreciated!

+7  A: 

The fact that it is a web server you want to build in C is less important than the fact that you want to do network socket programming in C. A web server is just a special case where you happen to be returning a certain type of content based on a request; the low-level plumbing is more important.

A good starting guide for Berkeley socket programming is http://www.uwo.ca/its/doc/courses/notes/socket/ which explains all the C structs and how to use them, and is a great primer for network programming in C or other low-level languages.
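To make the plumbing concrete, here is a minimal sketch of the socket / bind / listen / accept sequence, assuming IPv4 and blocking I/O. The names make_listener and serve_once are made up for this example, and the reply is hard-coded rather than based on the request. Binding to port 80 normally requires root, so an unprivileged port such as 8080 is easier while experimenting.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket bound to all interfaces on `port` and start listening.
   Returns the listening descriptor, or -1 on error. (Hypothetical helper.) */
int make_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    int yes = 1; /* lets the server restart without waiting for TIME_WAIT */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(fd, 10) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Accept one client, read (and ignore) its request, send a fixed reply.
   Returns 0 on success, -1 on error. (Hypothetical helper.) */
int serve_once(int listen_fd)
{
    static const char reply[] =
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        "Content-Length: 6\r\n"
        "Connection: close\r\n"
        "\r\n"
        "hello\n";
    char request[2048];

    int client = accept(listen_fd, NULL, NULL);
    if (client == -1)
        return -1;

    if (read(client, request, sizeof request) == -1) {
        close(client);
        return -1;
    }
    ssize_t sent = write(client, reply, sizeof reply - 1);
    close(client);
    return sent == (ssize_t)(sizeof reply - 1) ? 0 : -1;
}
```

A main() would call make_listener(8080) once and then serve_once() in a loop; pointing telnet or a browser at that port should show the reply.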

alxp
Yes, you make an interesting point, but that is why I stated that I know a bit about the protocol. I might just have been a little misleading. But you bring a nice resource to the table, thank you. Do you also have any comments on architecture when coding sockets in C?
Filip Ekberg
It depends on what your goals are for the project. If it's just to get a very simple web server up and running in C, then I'd stick with Unix processes, forking off a new process to handle each incoming request, rather than bother with threads. If you need performance, use an existing program instead.
alxp
+1 for the forking.
Salamander2007
I'm not looking to reinvent the wheel here; the purpose is to learn how to program on Unix and how to manage sockets. And when I speak of threads, I mean it in a more abstract sense, as in "multitasking". :)
Filip Ekberg
+1  A: 

I found this link as well, which might be helpful: http://gnosis.cx/publish/programming/sockets.html

Filip Ekberg
+2  A: 

I'd start with this good socket tutorial, and follow the HTTP 1.1 RFC.
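As a small illustration of what following the spec means on the wire, here is a sketch of building a response: a status line, header fields and a blank line, each terminated by CRLF, then the body. The function name format_response and its parameters are made up for this example, and it assumes the body is a NUL-terminated string.

```c
#include <stdio.h>
#include <string.h>

/* Write a complete HTTP/1.1 response into `buf`.
   Returns the number of bytes written, or -1 if `buf` is too small.
   (Hypothetical helper; assumes `body` is a NUL-terminated string.) */
int format_response(char *buf, size_t cap, int status, const char *reason,
                    const char *mime, const char *body)
{
    int n = snprintf(buf, cap,
                     "HTTP/1.1 %d %s\r\n"
                     "Content-Type: %s\r\n"
                     "Content-Length: %zu\r\n"
                     "Connection: close\r\n"
                     "\r\n"
                     "%s",
                     status, reason, mime, strlen(body), body);
    if (n < 0 || (size_t)n >= cap)
        return -1; /* encoding error or truncated output */
    return n;
}
```

Content-Length counts only the body bytes after the blank line, which is how the client knows where the response ends.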

friol
+2  A: 

To get some hints as to how a real HTTP server is written in C, you can look at the lighttpd source.

lighttpd (pronounced "lighty") is a web server designed to be secure, fast, standards-compliant, and flexible while being optimized for speed-critical environments. Lighttpd is written in C. Lighttpd is used by some of the biggest websites, including sites such as YouTube, Wikipedia and meebo.

gimel
A: 

If you're targeting Linux, I'd go with forking instead of threads to keep it simple.

Kim
A: 

Well, if you download the Platform Builder evaluation edition (stand-alone version 5.0 or the Studio '05 plug-in version 6.0), you get all of the source code. Windows CE has a web server (with support for a lot more than you're probably required to have), and the full source for it is part of that. It's a commercial-grade (though not my favorite) web server written completely in C. It targets Windows CE, but the code should compile and run fine under "big" Windows as well, since it uses Win32 APIs.

The only limitation of the eval edition is that you can only build OSes for 120 days, which has no impact on viewing the source.

ctacke
Needs to be for Linux, sorry for not stating that before.
Filip Ekberg
+1  A: 

Here is a list of small web servers, some of them written in C: Tiny web servers

And here is the code for a very small one, I posted the code here because the website is unavailable.

/* Copyright (C) 2007 Cosmin Gorgovan <[email protected]>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, version 2 of the License.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. */

/* qshttpd is a lightweight http server. It was tested only under Linux.
It is quite fast when handling small files, actually about 6 times faster
than Apache. I think it is useful to serve static content from your site.
Home page: www.linux-geek.org/qshttpd/ */

/* Version 0.3.0 - alpha software
See qshttpd.conf for a configuration example. */

/* TODO: logging, virtual hosts */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/wait.h>
#include <signal.h>
#include <pwd.h>
#include <grp.h>
#include <dirent.h> /* opendir() */

#define BACKLOG 10

//Variables used in get_conf().
//conf needs 6 bytes: fgets(conf, 6, ...) stores 5 characters plus the terminating NUL.
char conf[6], dir[500], port[10], charset[200], user[100], group[100];
int portf;

//Sockets stuff
int sockfd, new_fd;
struct sockaddr_in their_addr;
socklen_t sin_size;
struct sigaction sa;

//Other global variables
int buffer_counter;
char * buffer;
FILE *openfile;

void read_chunk() {
    fread (buffer,1,1048576,openfile);
    buffer_counter++;
}

void sigchld_handler(int s)
{
    while(waitpid(-1, NULL, WNOHANG) > 0);
}

//Chroot and change user and group to nobody. Got this function from Simple HTTPD 1.0.
void drop_privileges() {
    struct passwd *pwd;
    struct group *grp;

    if ((pwd = getpwnam(user)) == 0)
    {
     fprintf(stderr, "User not found in /etc/passwd\n");
     exit(1);
    }

    if ((grp = getgrnam(group)) == 0)
    {
     fprintf(stderr, "Group not found in /etc/group\n");
     exit(1);
    }
    if (chdir(dir) != 0)
    {
     fprintf(stderr, "chdir(...) failed\n");
     exit(1);
    }

    if (chroot(dir) != 0)
    {
     fprintf(stderr, "chroot(...) failed\n");
     exit(1);
    }

    if (setgid(grp->gr_gid) != 0)
    {
     fprintf(stderr, "setgid(...) failed\n");
     exit(1);
    }

    if (setuid(pwd->pw_uid) != 0)
    {
     fprintf(stderr, "setuid(...) failed\n");
     exit(1);
    }

}

void get_conf() {
    FILE *conffile;
    conffile = fopen ("/etc/qshttpd.conf", "r");
    if (conffile == NULL) {
        perror("/etc/qshttpd.conf");
        exit(1);
    }

    while (fgets (conf , 6, conffile)) {
    if (strcmp (conf, "ROOT=") == 0){
        fgets (dir, 500, conffile);
        strtok(dir, "\n");
    }
    if (strcmp (conf, "PORT=") == 0){
        fgets (port, 10, conffile);
        portf=atoi(port);
    }
    if (strcmp (conf, "CHAR=") == 0){
        fgets (charset, 200, conffile);
        strtok(charset, "\n");
    }
    if (strcmp (conf, "USER=") == 0){
        fgets (user, 100, conffile);
        strtok(user, "\n");
    }
    if (strcmp (conf, "GRUP=") == 0){
        fgets (group, 100, conffile);
        strtok(group, "\n");
    }
    } 
    fclose (conffile);
}

void create_and_bind() {
    int yes=1;
    struct sockaddr_in my_addr;

    if ((sockfd = socket(PF_INET, SOCK_STREAM, 0)) == -1) {
        perror("socket");
        exit(1);
    }

    if (setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,&yes,sizeof(int)) == -1) {
        perror("setsockopt");
        exit(1);
    }

    my_addr.sin_family = AF_INET;
    my_addr.sin_port = htons(portf);
    my_addr.sin_addr.s_addr = INADDR_ANY;
    memset(&(my_addr.sin_zero), '\0', 8);

    if (bind(sockfd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr)) == -1) {
        perror("bind");
        exit(1);
    }

    drop_privileges();

    if (listen(sockfd, BACKLOG) == -1) {
        perror("listen");
        exit(1);
    }

    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1) {
        perror("sigaction");
        exit(1);
    }
}

int main(void)
{
    //code, extf, rangetmp and buffer_chunks are read before every path assigns them, so give them defaults.
    char in[3000],  sent[500], code[50] = "", file[200], mime[100], moved[200], length[100], auth[200], auth_dir[500], start[100], end[100];
    char *result=NULL, *hostname, *hostnamef, *lines, *ext=NULL, *extf="", *auth_dirf=NULL, *authf=NULL, *rangetmp=NULL;
    int buffer_chunks = 0;
    long filesize, range=0;

    get_conf();
    create_and_bind();

    //Important stuff happens here.

    while(1) {
        sin_size = sizeof(struct sockaddr_in);
        if ((new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size)) == -1) {
            perror("accept");
            continue;
        }

        if (!fork()) {
            close(sockfd);
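        /* read() does not NUL-terminate, so zero the buffer and leave room for the terminator. */
        memset(in, 0, sizeof(in));
        if (read(new_fd, in, sizeof(in) - 1) == -1) {
     perror("read");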
        } else {
     lines = strtok(in, "\n\r");
     do {
         hostname = strtok(NULL, "\n\r");
         if (hostname == NULL) {
             /* No Host header in the request: give up instead of dereferencing NULL. */
             close(new_fd);
             exit(1);
         }
         if (strncmp(hostname, "Range", 5) == 0) {
             rangetmp = hostname;
             strcpy(code, "206 Partial Content");
         }
     } while (strncmp(hostname, "Host", 4) != 0);
     hostnamef = strtok(hostname, " ");
     hostnamef = strtok(NULL, " ");
     result = strtok(lines, " ");
     result = strtok(NULL, " ");
     if (strcmp(code, "206 Partial Content") == 0 ) {
      rangetmp = strtok(strpbrk(rangetmp, "="), "=-");
      range = atoi(rangetmp);
     }

     strcpy(file, result);
     if (opendir(file)){
         if (file[strlen(file)-1] == '/'){
                    strcat(file, "/index.html");
      openfile=fopen (file, "r");
                        if (openfile){
                            strcpy(code, "200 OK");
                        } else {
          //Here should be some kind of directory listing
          strcpy(file, "/404.html");
          openfile = fopen (file, "r");
          strcpy(code, "404 Not Found");
      }
         } else {
      strcpy(code, "301 Moved Permanently");
      strcpy(moved, "Location: http://");
      strcat(moved, hostnamef);
      strcat(moved, result);
      strcat(moved, "/");
         }
     } else {
         openfile=fopen (file, "rb");
                if (openfile){
      if (strlen(code) < 1) {
                     strcpy (code, "200 OK");
      }
                } else {
      strcpy(file, "/404.html");
      openfile = fopen (file, "r");
                    strcpy(code, "404 Not Found");
                }
                }
        }
        if (strcmp(code, "301 Moved Permanently") != 0){
     fseek (openfile , 0 , SEEK_END);
                filesize = ftell (openfile);
         rewind (openfile);
     if (range > 0) {
      sprintf(end, "%ld", filesize);   /* filesize and range are long, so use %ld */
      filesize = filesize - range;
      sprintf(start, "%ld", range);
      fseek (openfile , range , SEEK_SET);
     }
     buffer_chunks = filesize/1048576;
     if(filesize%1048576 > 0){
      buffer_chunks++;
     }
     sprintf(length, "%ld", filesize);
     buffer_counter = 0;
     buffer = (char*) malloc (sizeof(char)*1048576);
        }

        if (strcmp(code, "404 Not Found") != 0 && strcmp(code, "301 Moved Permanently") !=0){
     ext = strtok(file, ".");
            while(ext != NULL){
         ext = strtok(NULL, ".");
             if (ext != NULL){
      extf = ext;
         }
     }
        } else {
     extf="html";
        }

        /* Maybe I should read mime types from a file. At least for now, add here what you need.*/

        if (strcmp(extf, "html") == 0){
     strcpy (mime, "text/html");
            } else if(strcmp(extf, "jpg") == 0){
     strcpy (mime, "image/jpeg");
        } else if(strcmp(extf, "gif") == 0){
     strcpy (mime, "image/gif");
        } else if(strcmp(extf, "css") == 0){
     strcpy (mime, "text/css");
        } else {
     strcpy(mime, "application/octet-stream");
        }

        strcpy(sent, "HTTP/1.1 ");
        strcat(sent, code);
        strcat(sent, "\nServer: qshttpd 0.3.0\n");
        if(strcmp(code, "301 Moved Permanently") == 0){
     strcat(sent, moved);
     strcat(sent, "\n");
        }

        strcat(sent, "Content-Length: ");
        if(strcmp(code, "301 Moved Permanently") != 0){
            strcat(sent, length);
        } else {
     strcat(sent, "0");
        }
        if(strcmp(code, "206 Partial Content") == 0) {
     strcat(sent, "\nContent-Range: bytes ");
     strcat(sent, start);
     strcat(sent, "-");
     strcat(sent, end);
     strcat(sent, "/");
     strcat(sent, end);
    }
        strcat(sent, "\nConnection: close\nContent-Type: ");
        strcat(sent, mime);
        strcat(sent, "; charset=");
        strcat(sent, charset);
        strcat(sent, "\n\n");
        write(new_fd, sent, strlen(sent));
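     while (buffer_counter < buffer_chunks) {
      /* Send only the bytes fread() actually delivered; the last chunk is usually short. */
      size_t got = fread(buffer, 1, 1048576, openfile);
      buffer_counter++;
      if (got == 0) {
       break;
      }
      write(new_fd, buffer, got);
     }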
        close(new_fd);
            exit(0);
        }
        close(new_fd);
    }
    return 0;
}
codeassembly
I don't like that code. But thanks for the example; I'm looking for something more commented, more structured and more explanatory, I guess.
Filip Ekberg
It is pretty clear... added to favorites thanks!
DFectuoso
+12  A: 

When I had a similar assignment in a network programming course I used Beej's Guide to Network Programming. I only had to write a simple server that handled GET and POST requests, but Beej's Guide got me all the way through the project.
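For handling GET and POST, the first step is splitting the request line ("GET /index.html HTTP/1.1") into its three parts. Here is a minimal sketch; the struct and function names are made up, and the buffer sizes are arbitrary assumptions (a path longer than 255 bytes will not parse correctly).

```c
#include <stdio.h>
#include <string.h>

/* The three space-separated parts of "METHOD PATH VERSION". */
struct request_line {
    char method[16];
    char path[256];
    char version[16];
};

/* Returns 0 on success, -1 if the line does not look like an HTTP request line. */
int parse_request_line(const char *line, struct request_line *out)
{
    /* Field widths are one less than the buffers so sscanf cannot overflow them. */
    if (sscanf(line, "%15s %255s %15s",
               out->method, out->path, out->version) != 3)
        return -1;
    if (strncmp(out->version, "HTTP/", 5) != 0)
        return -1;
    return 0;
}
```

After that you can dispatch with strcmp(req.method, "GET") or strcmp(req.method, "POST") and answer anything else with 501 Not Implemented.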

Bill the Lizard
I like that very much, thank you!
Filip Ekberg
You're welcome. Good luck with your project.
Bill the Lizard
+3  A: 

It's strange that Apache was not mentioned here.
It might be a little too complex for your purpose, but it's definitely one of the most impressive pieces of C code you can learn from, covering socket programming and more advanced topics.
BTW, I'm not 100% sure, but as far as I know Apache is a single-threaded application and includes a built-in scheduler for efficiency.

Ilya
That's not a tutorial really :/
Filip Ekberg
It is very well documented and one of the best web servers... So, as I stated, it might be a bit complex, but if you want to learn best practices this is the place.
Ilya
+2  A: 

See this question/answer here for a bunch more.

There are several ways to architect multithreaded socket servers, but one I've found to be successful and scalable is to use two thread pools. Have one or more threads listening for incoming connections, and have them pass those connections into a queue of work items. You then have a thread pool with a bunch of threads that process the queue. As soon as a thread is done processing one request, it goes back to the queue and gets the next request. This sets things up as a classic producer/consumer arrangement, which is generally a solved problem in multithreaded libraries.

Having the connection accepted immediately lets the client know that the server is not down even when it's overloaded, and you can keep an eye on the length of the work queue to know when you need to add new hardware.
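Here is roughly what that producer/consumer setup could look like with POSIX threads, as a sketch rather than a finished design. The queue is a fixed-size ring buffer of file descriptors guarded by a mutex and two condition variables. All names (work_queue, queue_push, start_workers, send_greeting) and the capacity of 64 are assumptions made for this example; compile with -pthread.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define QUEUE_CAP 64

struct work_queue {
    int fds[QUEUE_CAP];            /* ring buffer of accepted descriptors */
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
    void (*handle)(int fd);        /* per-connection callback */
};

void queue_init(struct work_queue *q, void (*handle)(int fd))
{
    q->head = q->tail = q->count = 0;
    q->handle = handle;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Producer side, called by the accepting thread. Blocks while the queue is full. */
void queue_push(struct work_queue *q, int fd)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->fds[q->tail] = fd;
    q->tail = (q->tail + 1) % QUEUE_CAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side. Blocks until a descriptor is available. */
int queue_pop(struct work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int fd = q->fds[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return fd;
}

/* Thread entry point: take a connection, handle it, close it, repeat. */
void *worker_main(void *arg)
{
    struct work_queue *q = arg;
    for (;;) {
        int fd = queue_pop(q);
        q->handle(fd);
        close(fd);
    }
    return NULL;
}

/* Start `n` detached worker threads. Returns 0 on success, -1 on failure. */
int start_workers(struct work_queue *q, int n)
{
    for (int i = 0; i < n; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker_main, q) != 0)
            return -1;
        pthread_detach(tid);
    }
    return 0;
}

/* Hypothetical per-connection handler, for illustration only. */
void send_greeting(int fd)
{
    const char msg[] = "hello\n";
    if (write(fd, msg, sizeof msg - 1) == -1)
        perror("write");
}
```

The accepting thread then does nothing but queue_push(&q, accept(listen_fd, NULL, NULL)) in a loop, and q.count is the work-queue length you would watch to see when the server is falling behind.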

You could also do the same thing with a pool of processes using fork instead of threads.

Here's a paper describing a pipelined multithreaded web-server, but it also outlines a variety of other standard architectures.

Eclipse
I know the basics of sockets and, as I stated, I know how HTTP 1.1 works. What I do need, and what that thread doesn't supply, is guidance on structure (web-server-specific structure in particular) and good tutorials on how to create a socket app that uses multiple threads.
Filip Ekberg
Ah - I see what you're getting at.
Eclipse
+1  A: 

Advanced Programming in the Unix Environment comes in handy for a lot of the syscalls on Linux.

Calyth
Already got that. It's more general than I wanted, though.
Filip Ekberg