What test text do you type into your web forms to check that they handle all the edge cases properly (especially Unicode and XSS-style problems)?

I am particularly interested in good Unicode strings that might do something odd if they are mis-encoded when they are displayed again.

Text that contains potentially problematic characters, such as quotes, <, >, etc., would also be interesting.

Thanks

A: 

Well, this is a bit of a brute-force approach, but if you want to start from some well-formed Unicode and add some errors, a great resource for the real stuff is here: http://www.unicode.org/charts.
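
One way to "add some errors" to well-formed text could look like the rough Python sketch below. It is only an illustration: the helper name truncate_last_byte and the sample string are made up for this example. It encodes a string as UTF-8, chops off the final byte so the last multi-byte character is incomplete, and then checks how a strict and a lenient decoder react.

```python
# Hypothetical helper (not from any library): make malformed UTF-8
# by cutting a well-formed byte string off mid-character.
def truncate_last_byte(text: str) -> bytes:
    """Encode text as UTF-8 and drop the final byte."""
    return text.encode("utf-8")[:-1]


# Sample text chosen so that it ends in a two-byte character (é = 0xC3 0xA9).
sample = "naïve café"
broken = truncate_last_byte(sample)

# A strict decoder should reject the incomplete trailing sequence.
try:
    broken.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decoder substitutes U+FFFD (the replacement character) instead.
lenient = broken.decode("utf-8", errors="replace")
```

Submitting bytes like `broken` to a form endpoint (for example with a raw HTTP client rather than a browser) is one way to see whether the server rejects, replaces, or silently stores the bad sequence.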

John Lockwood
+3  A: 

Your idea of HTML-sensitive characters is a good start. I also like using characters that are kind of readable, but are still Unicode. When I was doing this kind of testing for tabblo.com, I used this string:

Testing «ταБЬℓσ»: 1<2 & 4+1>3, now 20% off!

This has HTML-sensitive characters, ASCII, upper-half ISO 8859-1 characters, and multi-byte Unicode characters.
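
As a rough illustration of what that string exercises, here is a small Python sketch. It uses only the standard library; the variable names are invented for this example, and it is not the code that was used for tabblo.com. It shows what the string should look like after HTML escaping, and what it turns into when its UTF-8 bytes are wrongly decoded as Latin-1.

```python
import html

test_string = "Testing «ταБЬℓσ»: 1<2 & 4+1>3, now 20% off!"

# Correct handling: <, > and & become entities, everything else is untouched.
escaped = html.escape(test_string)

# Escaping should be reversible without losing any characters.
round_tripped = html.unescape(escaped)

# Simulated bug: UTF-8 bytes decoded with the wrong charset (Latin-1).
# Every non-ASCII character turns into two or three junk characters.
utf8_bytes = test_string.encode("utf-8")
mojibake = utf8_bytes.decode("latin-1")

# This particular kind of mojibake can be undone by reversing the mistake.
repaired = mojibake.encode("latin-1").decode("utf-8")
```

If the page echoes back something that starts with "Testing Â«" instead of "Testing «", the bytes were probably decoded with the wrong charset somewhere along the way. If a raw "<" shows up in the page source, the output is not being escaped.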

Ned Batchelder
This isn't malformed UTF-8 data, though, which would also be useful to test.
Alix Axel
+2  A: 

Turkey testing!

http://www.moserware.com/2008/02/does-your-code-pass-turkey-test.html

This is actually pretty advanced internationalization testing, not for the faint of heart, including date formatting, percent calculations, upper/lowercase translations, etc.
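
For a taste of the upper/lowercase part, here is a minimal Python sketch. It assumes nothing beyond Python 3's built-in str methods, which are locale-independent, so it only shows why the Turkish dotted and dotless i break naive case handling; it does not show how to do Turkish casing correctly.

```python
# Turkish has four distinct letters: I / ı (dotless) and İ / i (dotted).
dotted_capital_i = "\u0130"   # İ
dotless_small_i = "\u0131"    # ı

# Default Unicode lowercasing of İ yields "i" plus a combining dot above,
# so the result is two code points long instead of one.
lowered = dotted_capital_i.lower()

# Uppercasing the dotless ı gives a plain ASCII "I".
uppered = dotless_small_i.upper()

# A naive case-insensitive comparison: a Turkish user would expect
# "TITLE" to lowercase to "tıtle", but locale-independent code gives "title".
naive_lower = "TITLE".lower()
```

Typing İ and ı into fields that are later lowercased, uppercased, or compared case-insensitively (usernames, search boxes, tags) is a quick way to find code that assumes one code point in means one code point out.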

willoller
+1  A: 

These smilies from SuperUser.com are pretty cool for testing your Unicode support as well:

http://superuser.com/questions/52671/how-do-i-create-unicode-smilies-like

٩(-̮̮̃-̃)۶ ٩(●̮̮̃•̃)۶ ٩(͡๏̯͡๏)۶ ٩(-̮̮̃•̃).

rikh