I could not find anything in the spec that says HTTP headers should be UTF-8 encoded, and I have seen a couple of browsers send User-Agent strings that are not valid UTF-8. There is, however, a Content-Type request header that specifies the media type (and charset), and I'm not sure whether it applies only to the body of the request or to the headers too.

A: 

HTTP header field values may contain characters other than ASCII:

message-header = field-name ":" [ field-value ]
field-name     = token
field-value    = *( field-content | LWS )
field-content  = <the OCTETs making up the field-value
                 and consisting of either *TEXT or combinations
                 of token, separators, and quoted-string>

See the Basic Rules (RFC 2616, section 2.2) for the definitions of OCTET and TEXT:

OCTET          = <any 8-bit sequence of data>
TEXT           = <any OCTET except CTLs,
                 but including LWS>

In practice, though, field values generally contain only ASCII characters.
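
As a rough illustration of the TEXT rule quoted above, here is a minimal Python sketch that checks whether the raw octets of a field value are allowed by TEXT and whether they happen to be pure ASCII. The function names are hypothetical, and the check is simplified: it accepts HT and SP as whitespace but does not handle folded lines (CRLF followed by whitespace), which the full LWS rule permits.

```python
# CTLs are octets 0-31 and 127 (DEL), per the RFC 2616 basic rules.
CTLS = set(range(0, 32)) | {127}
HT = 9  # horizontal tab, allowed as part of LWS


def is_text(value: bytes) -> bool:
    """Return True if every octet is allowed by TEXT.

    Simplification (assumption): folded lines (CRLF + SP/HT) are not
    handled, so CR and LF are rejected like any other CTL.
    """
    return all(b not in CTLS or b == HT for b in value)


def is_ascii_only(value: bytes) -> bool:
    """Return True if the value contains no octet above 127."""
    return all(b < 128 for b in value)


latin1_value = "caf\u00e9".encode("iso-8859-1")  # b'caf\xe9'
print(is_text(latin1_value))        # allowed by TEXT
print(is_ascii_only(latin1_value))  # but not pure ASCII
```

The point of the two separate checks is that "allowed by the grammar" and "ASCII only" are different questions: an octet such as 0xE9 passes the first and fails the second.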

Gumbo
A: 

The Content-Type header applies to the body, not the headers.
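
To make that concrete, here is a small Python sketch in which the charset parameter of Content-Type is used to decode the body only. The helper name decode_body and the ISO-8859-1 fallback are assumptions for illustration; the parsing relies on the standard library's email.message.Message.get_content_charset().

```python
from email.message import Message


def decode_body(content_type: str, body: bytes,
                default: str = "iso-8859-1") -> str:
    """Decode a message body using the charset from Content-Type.

    Hypothetical helper: the fallback charset is an assumption, and the
    charset is applied to the body bytes only, never to header values.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is set.
    charset = msg.get_content_charset() or default
    return body.decode(charset)


body = "caf\u00e9".encode("utf-8")
print(decode_body("text/plain; charset=utf-8", body))
```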

ss
+1  A: 

The HTTP RFC defines header content as *TEXT, which is defined on or about page 15 as ISO-8859-1, except where characters outside ISO-8859-1 are encoded according to RFC 2047.
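
A short Python sketch of what treating header values as ISO-8859-1 means in practice. The function name is hypothetical. Decoding as ISO-8859-1 never fails, because every octet maps to a character, but a client that actually sent UTF-8 bytes will come out garbled unless the receiver re-encodes and decodes again.

```python
def decode_header_value(raw: bytes) -> str:
    """Decode raw header octets as ISO-8859-1 (hypothetical helper).

    Every byte value 0-255 maps to exactly one character, so this
    cannot raise and the original bytes can always be recovered.
    """
    return raw.decode("iso-8859-1")


raw = "caf\u00e9".encode("utf-8")      # client sent UTF-8: b'caf\xc3\xa9'
as_latin1 = decode_header_value(raw)   # two garbled characters, not one
print(len(as_latin1))                  # one character longer than "cafe"

# The decoding is lossless, so the original bytes can be recovered
# and reinterpreted as UTF-8 if the receiver decides that is what they are.
recovered = as_latin1.encode("iso-8859-1").decode("utf-8")
print(recovered == "caf\u00e9")
```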

archetypal
+1. Note that the reference to RFC 2047 is widely considered an error, since RFC 2047 encoded-words explicitly cannot be included in such a context. The RFC 2047 reference has been removed in the newer, pending standards work on HTTP.
bobince