Hello,
I'm tring to create form validation unit that, in addition to "regular" tests checks encoding as well.
According to this article http://www.w3.org/International/questions/qa-forms-utf-8 the allowed characters are CR, LF and TAB in range of 0-31, the DEL=127 in not allowed.
On the other hand, there are control characters in range 0x80-0xA0. In different sources I had seen that they are allowed and that not. Also I had seen that this is different for XHTML, HTML and XML.
Some articles had told that FF is allowed as well?
Can someone provide a good answer with sources what can be given and what isn't?
EDIT: Even there http://www.w3.org/International/questions/qa-controls some ambiguity
The C1 range is supported
But table shows that they are illegal and previous shown UTF-8 validations allows them?