I've recently become interested in building a small, efficient web server in C, and I'm having trouble parsing the POST data from an HTTP request. Does anyone have advice on how to retrieve the name/value pairs from the posted data?

POST /yeah HTTP/1.1
Host: cor.cs.uky.edu:7017
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://cor.cs.uky.edu:7017/cs316post.html
Cookie: __utma=43166241.217413299.1220726314.1221171690.1221200181.16; __utmz=43166241.1220726314.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)
Cache-Control: max-age=0
Content-Type: application/x-www-form-urlencoded
Content-Length: 25

field1=asfd&field2=a3f3f3 <-- This

I see no reliable way to retrieve the bottom line as a whole and be sure it works every time. I'd rather not hard-code anything.

Thanks in advance!

+7  A: 

You can find the name/value pairs by searching for the blank line that ends the headers, that is, two consecutive newlines, or more precisely \r\n\r\n. The body of the message starts right after it.
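As a rough illustration of locating the body, here is a minimal sketch. It assumes the bytes read so far sit in a NUL-terminated buffer; `find_body` is a hypothetical helper name, not something from the RFC or a library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the first byte of the message
 * body, or NULL if the blank line ending the headers has not been seen yet.
 * Assumes `request` is a NUL-terminated buffer of the bytes read so far. */
const char *find_body(const char *request)
{
    assert(request != NULL);

    /* The headers end at the first CRLF CRLF sequence. */
    const char *sep = strstr(request, "\r\n\r\n");

    /* Skip the four separator bytes to land on the body. */
    return sep != NULL ? sep + 4 : NULL;
}
```

A NULL result usually just means more data has to be read from the socket before trying again.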

Then split the body on & and split each resulting string on = to get the name/value pairs.
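The two-level split could look something like the sketch below. It assumes the body has been copied into a writable, NUL-terminated buffer, and it modifies that buffer in place. `struct form_pair` and `split_pairs` are made-up names for illustration. Percent-decoding (%XX and + for space) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pair type: both pointers refer into the caller's buffer. */
struct form_pair {
    char *name;
    char *value;
};

/* Splits `body` in place on '&', then splits each piece on the first '='.
 * Stores at most `max_pairs` entries in `out` and returns how many were
 * stored. A piece without '=' gets an empty value. */
size_t split_pairs(char *body, struct form_pair *out, size_t max_pairs)
{
    assert(body != NULL && out != NULL);

    size_t count = 0;
    char *cursor = body;

    while (cursor != NULL && *cursor != '\0' && count < max_pairs) {
        /* Cut the current pair off from the rest of the body. */
        char *amp = strchr(cursor, '&');
        if (amp != NULL)
            *amp = '\0';

        /* Separate the name from the value. */
        char *eq = strchr(cursor, '=');
        if (eq != NULL) {
            *eq = '\0';
            out[count].value = eq + 1;
        } else {
            out[count].value = cursor + strlen(cursor); /* empty string */
        }
        out[count].name = cursor;
        count++;

        cursor = (amp != NULL) ? amp + 1 : NULL;
    }
    return count;
}
```

Called on a writable copy of field1=asfd&field2=a3f3f3, this yields two pairs: field1/asfd and field2/a3f3f3.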

See the HTTP 1.1 RFC.

Brian R. Bondy
Ah, thanks. I noticed there was an extra space right before the string of name/value pairs, but didn't put two and two together.
treefrog
@rofly: do not compute two and two, just read the standard (RFC 2616). It's in section 4.1.
bortzmeyer
+1  A: 

You need to keep parsing the stream as headers until you see the blank line. The rest is the POST data.

You need to write a little parser for the post data. You can use C library routines to do something quick and dirty, like index, strtok, and sscanf. If you have room for it in your definition of "small", you could do something more elaborate with a regular expression library, or even with flex and bison.

At least, I think this kind of answers your question.

jfm3
+2  A: 

Once you have Content-Length from the headers, you know the number of bytes to read right after the blank line. If Content-Length is absent (whether the request is a GET or a POST), there is generally nothing to read after the blank line (CRLF), unless the request uses Transfer-Encoding: chunked.
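One way to pull the length out of the header block is sketched below. It assumes the whole header block is already in a NUL-terminated buffer. `content_length` and `has_prefix_nocase` are hypothetical helper names. The header name is matched case-insensitively, and -1 means "not present or not a number".

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `s` starts with `prefix`, ignoring ASCII case. */
static int has_prefix_nocase(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Hypothetical helper: scans the header lines of `request` and returns the
 * Content-Length value, or -1 if the header is missing or malformed.
 * Scanning stops at the blank line, so body bytes are never inspected. */
long content_length(const char *request)
{
    assert(request != NULL);

    static const char name[] = "Content-Length:";
    const size_t name_len = sizeof name - 1;
    const char *line = request;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, "\r\n", 2) == 0)
            break; /* blank line: end of headers */

        if (has_prefix_nocase(line, name)) {
            char *end;
            long value = strtol(line + name_len, &end, 10);
            return (end != line + name_len && value >= 0) ? value : -1;
        }

        /* Advance to the start of the next header line. */
        const char *eol = strstr(line, "\r\n");
        line = (eol != NULL) ? eol + 2 : NULL;
    }
    return -1;
}
```

With the value in hand, the server can keep reading from the socket until exactly that many body bytes have arrived, rather than guessing where the data ends.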