I could not find anything in the spec which says it should be. I have seen a couple of browsers send User-Agent strings that are not UTF-8 encoded. There is, however, a Content-Type request header that specifies the media type (and charset), and I'm not sure whether it applies only to the request body or to the headers too.
A:
HTTP header field values may contain non-ASCII characters. RFC 2616 (section 4.2) defines them as:
message-header = field-name ":" [ field-value ]
field-name     = token
field-value    = *( field-content | LWS )
field-content  = <the OCTETs making up the field-value
                 and consisting of either *TEXT or combinations
                 of token, separators, and quoted-string>
See the Basic Rules (RFC 2616, section 2.2) for the definitions of OCTET and TEXT:
OCTET = <any 8-bit sequence of data>
TEXT  = <any OCTET except CTLs,
        but including LWS>
In practice, however, field values generally use only ASCII characters.
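To make the TEXT rule concrete, here is a minimal Python sketch that checks whether a raw field value consists only of octets the rule allows. The function name `is_text` is hypothetical, and the check is simplified: it treats a CRLF followed by a space or tab as line folding (part of LWS) and rejects every other control character except HT.

```python
import re

# Line folding per the LWS rule: a CRLF that is followed by SP or HT.
_FOLD = re.compile(rb"\r\n(?=[ \t])")


def is_text(value: bytes) -> bool:
    """Return True if value matches *TEXT: any octet except CTLs, LWS allowed.

    Hypothetical helper for illustration; not taken from any HTTP library.
    """
    unfolded = _FOLD.sub(b"", value)
    for octet in unfolded:
        is_ctl = octet < 32 or octet == 127  # CTL = octets 0-31 and DEL (127)
        if is_ctl and octet != 9:            # HT (9) is part of LWS, so allowed
            return False
    return True
```

Under this sketch, octets in the range 128-255 pass the check, which is why a non-ASCII User-Agent string is not rejected by the grammar alone.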
Gumbo
2009-11-05 18:01:09
+1
A:
The HTTP RFC defines header content as *TEXT, which is defined on or about page 15 as ISO-8859-1, except when characters outside ISO-8859-1 are encoded according to RFC 2047.
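As a rough illustration of what the ISO-8859-1 default means for a receiver, the Python sketch below decodes raw header octets as ISO-8859-1. The helper name `decode_field_value` and the sample value are assumptions made for this example. Because ISO-8859-1 maps every octet to exactly one character, decoding never fails, but a value that was actually sent as UTF-8 comes out garbled.

```python
def decode_field_value(raw: bytes) -> str:
    """Decode header octets as ISO-8859-1 (hypothetical helper).

    Every octet 0-255 maps to one code point, so this cannot raise.
    """
    return raw.decode("iso-8859-1")


sample = "Caf\u00e9"                      # "Café", illustrative value only
raw_latin1 = sample.encode("iso-8859-1")  # b'Caf\xe9'
raw_utf8 = sample.encode("utf-8")         # b'Caf\xc3\xa9'

as_sent_latin1 = decode_field_value(raw_latin1)  # decodes to the original text
as_sent_utf8 = decode_field_value(raw_utf8)      # two characters replace the "é"

# The decoding is lossless, so the original octets can be recovered and
# re-decoded as UTF-8 if the sender is known to have used it.
recovered = as_sent_utf8.encode("iso-8859-1").decode("utf-8")
```

The round trip at the end works only when the receiver has some out-of-band reason to believe the sender used UTF-8; the header itself carries no charset label.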
archetypal
2009-11-05 18:07:33
+1. Note that the reference to RFC 2047 is widely considered an error, since RFC 2047 encoded-words explicitly cannot be included in such a context. The RFC 2047 reference has been removed from the newer, pending standards work on HTTP.
bobince
2009-11-05 18:15:04