On the MSDN site there is an example of some C# code that shows how to make a web request with POST'ed data. Here is an excerpt of that code:

WebRequest request = WebRequest.Create ("http://www.contoso.com/PostAccepter.aspx ");
request.Method = "POST";
string postData = "This is a test that posts this string to a Web server.";
byte[] byteArray = Encoding.UTF8.GetBytes (postData); // (*)
request.ContentType = "application/x-www-form-urlencoded";
request.ContentLength = byteArray.Length;
Stream dataStream = request.GetRequestStream ();
dataStream.Write (byteArray, 0, byteArray.Length);
dataStream.Close ();
WebResponse response = request.GetResponse ();
...more...

The line marked (*) is the line that puzzles me. Shouldn't the data be encoded using the UrlEncode function rather than UTF8? Isn't that what application/x-www-form-urlencoded implies?

+1  A: 

The sample code is misleading, because ContentType is set to application/x-www-form-urlencoded but the actual content is plain text. Content of type application/x-www-form-urlencoded is a string of name/value pairs like this:

name1=value1&name2=value2

The UrlEncode function is used to escape special characters like '&' and '=' so a parser doesn't consider them as syntax. It takes a string (media type text/plain) and returns a string (media type application/x-www-form-urlencoded).

Encoding.UTF8.GetBytes is used to convert the string (media type application/x-www-form-urlencoded in our case) into an array of bytes, which is what the WebRequest API expects.
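
The two steps above can be sketched roughly as follows. This assumes System.Web.HttpUtility is available (on .NET Framework that means a reference to System.Web); the class name FormEncodingSketch and its method names are made up for illustration and are not part of the MSDN sample.

```csharp
using System.Text;
using System.Web;

// Hypothetical helper for illustration only; not part of the MSDN sample.
public static class FormEncodingSketch
{
    // Step 1: plain text -> application/x-www-form-urlencoded string.
    // Characters such as '&' and '=' inside the name or value are escaped,
    // so the only literal '=' left is the separator added here.
    public static string EncodePair(string name, string value)
    {
        return HttpUtility.UrlEncode(name) + "=" + HttpUtility.UrlEncode(value);
    }

    // Step 2: string -> byte array, which is what the request stream accepts.
    public static byte[] ToBytes(string formEncoded)
    {
        return Encoding.UTF8.GetBytes(formEncoded);
    }
}
```

A call such as FormEncodingSketch.ToBytes(FormEncodingSketch.EncodePair("q", "a&b=c")) would produce the bytes to write to the request stream. Since UrlEncode should leave only ASCII characters in its output, Encoding.UTF8 and Encoding.ASCII ought to yield the same bytes at step 2.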

Max Toro
Can application/x-www-form-urlencoded include non-ASCII characters? I interpreted this to mean no: http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1 Or am I misunderstanding it?
Martin Smith
As Martin Smith notes above, `application/x-www-form-urlencoded` indicates the content has been encoded in a specific way. How does using Encoding.UTF8 address this?
rlandster
@Martin Smith: Don't know. Just use the UrlEncode function to encode the names and the values and you should be fine. I think the sample uses UTF8 because that is what literal strings in C# are.
Max Toro
@rlandster I'm pretty sure it doesn't, and in the event that the postdata wasn't so simple you'd need to encode it using some other utility. I'm not sure why you would use `Encoding.UTF8.GetBytes` rather than `Encoding.ASCII.GetBytes`, though. Edit: I see Max has posted a possible explanation above.
Martin Smith
+1  A: 

As Max Toro indicated, the example on the MSDN site is incorrect: a correct form POST requires the data to be URL-encoded. The data in the MSDN example contains almost no characters that encoding would change (only the spaces, which UrlEncode turns into '+'), so it is, in a sense, nearly encoded already.

The correct code would have a System.Web.HttpUtility.UrlEncode call on the names and values of each name/value pair before combining them into the name1=value1&name2=value2 string.
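
A minimal sketch of what that corrected code might look like, reusing the WebRequest calls from the MSDN excerpt. The class FormPostSketch, its methods, and any field names passed to it are hypothetical; the MSDN sample does not name its fields.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
using System.Web;

// Hypothetical helper for illustration only.
public static class FormPostSketch
{
    // UrlEncode each name and each value separately, then join the pairs with '&'.
    public static string BuildBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (KeyValuePair<string, string> field in fields)
        {
            parts.Add(HttpUtility.UrlEncode(field.Key) + "=" + HttpUtility.UrlEncode(field.Value));
        }
        return string.Join("&", parts.ToArray());
    }

    // Send the encoded body the same way the MSDN excerpt does.
    public static WebResponse Post(string url, IEnumerable<KeyValuePair<string, string>> fields)
    {
        byte[] body = Encoding.UTF8.GetBytes(BuildBody(fields));

        WebRequest request = WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = body.Length;

        // Disposing the stream closes it, which replaces the explicit Close() call.
        using (Stream dataStream = request.GetRequestStream())
        {
            dataStream.Write(body, 0, body.Length);
        }
        return request.GetResponse();
    }
}
```

Encoding the names and values before joining matters: encoding the whole joined string instead would also escape the '&' and '=' separators, and the server could no longer split the pairs.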

This page was helpful: http://geekswithblogs.net/rakker/archive/2006/04/21/76044.aspx

rlandster