I had a regex as the first line of defense against XSS.
    public static function standard_text($str)
    {
        // \pL    matches letters
        // \pN    matches numbers
        // \pZ    matches separators (whitespace)
        // \p{Pc} matches connector punctuation (underscores)
        // \p{Pd} matches dashes
        // \p{Po} matches other (normal) punctuation
        return (bool) preg_match('/^[\pL\pN\pZ\p{Pc}\p{Pd}\p{Po}]++$/uD', (string) $str);
    }
It is actually from Kohana 2.3.
This runs on publicly entered text (no HTML, ever), and the input is rejected if it fails this test. The text is always displayed with htmlspecialchars() (or, more specifically, Kohana's flavour, which adds the character set amongst other things). I also apply strip_tags() on output (even though I know it can ruin stuff like: 5 < 3!! :>).
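To make the output side concrete, here is a minimal sketch of the escaping step using plain PHP rather than Kohana's wrapper. The helper name `escape_for_html()` is made up for this example, and passing `ENT_QUOTES` and `'UTF-8'` is my assumption about what a sensible wrapper would do.

```php
<?php
// Hypothetical helper, not part of Kohana: escapes text for HTML output.
function escape_for_html($str)
{
    // ENT_QUOTES escapes both double and single quotes, so the result can
    // also be placed inside a quoted attribute value.
    return htmlspecialchars((string) $str, ENT_QUOTES, 'UTF-8');
}

$input = '5 < 3!! :> "quoted" <script>alert(1)</script>';

// The markup characters come out as entities, so the browser shows them as text.
echo escape_for_html($input), "\n";
// 5 &lt; 3!! :&gt; &quot;quoted&quot; &lt;script&gt;alert(1)&lt;/script&gt;
```

With this in place the `<script>` tag is displayed literally instead of being executed, and the `5 < 3!! :>` text survives intact, which is not guaranteed once strip_tags() is involved.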
The client had a problem when he wanted to enter some text with parentheses. I thought about modifying or extending the helper, but I also had a secondary thought - if I allow double quotes, is there really any reason why I need to validate at all?
Can I just rely on the escaping on output?
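For reference, the modification I was considering would look roughly like the sketch below. The function name `standard_text_with_brackets()` is hypothetical, and adding `\p{Ps}` and `\p{Pe}` (the Unicode categories for opening and closing punctuation) is just one way to let parentheses through. Note that the ASCII double quote is already in `\p{Po}`, so the original pattern accepts it as it stands, while `<` and `>` are math symbols (`\p{Sm}`) and are still rejected.

```php
<?php
// Hypothetical variant of standard_text(); the name is made up for this sketch.
function standard_text_with_brackets($str)
{
    // Same character class as the original, plus:
    // \p{Ps} opening punctuation such as ( [ {
    // \p{Pe} closing punctuation such as ) ] }
    return (bool) preg_match('/^[\pL\pN\pZ\p{Pc}\p{Pd}\p{Po}\p{Ps}\p{Pe}]++$/uD', (string) $str);
}

var_dump(standard_text_with_brackets('Some text (with parentheses)')); // bool(true)
var_dump(standard_text_with_brackets('He said "hello"'));              // bool(true)
var_dump(standard_text_with_brackets('5 < 3!! :>'));                   // bool(false)
```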