I'm a web developer with an emphasis on server-side programming. What little I've tinkered with JavaScript, I've done with externally referenced files or event handlers, and the barest minimum of an initialising function call between <script> tags.
As such it came as a surprise to me about a week ago that the data between <script> tags is not commonly escaped. In fact... it can't be. Escaping it will throw a massive lolwut-ohnoez-wrench into the works of the JavaScript parser in, as far as I know, every browser on the face of the earth.
This leads us to the (IMO) clusterfuck that is having to use CDATA for documents with in-HTML JavaScript blocks to pass validation (in XHTML), which still breaks hilariously the moment you have ]]> in your code for any arbitrary reason.
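To make the ]]> problem concrete, here is a minimal sketch of the commented-CDATA wrapping as I understand it. The helper name wrapScriptForXhtml is hypothetical, made up for illustration; it simply refuses any source that contains the CDATA terminator instead of trying to work around it.

```javascript
// Hypothetical helper (not from any library): wraps inline JavaScript in the
// commented CDATA markers commonly used to get XHTML to validate.
function wrapScriptForXhtml(source) {
  // "]]>" ends a CDATA section no matter where it appears, so perfectly
  // valid JavaScript such as a[b[0]]>1 would cut the section short.
  if (source.includes("]]>")) {
    throw new Error('source contains "]]>", which would end the CDATA section early');
  }
  return (
    '<script type="text/javascript">\n' +
    "//<![CDATA[\n" +
    source +
    "\n//]]>\n" +
    "</script>"
  );
}
```

Passing "var a = 1;" works as expected, while the innocent-looking "if (a[b[0]]>1) {}" is rejected, because the nested array index followed by a comparison happens to spell out the terminator.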
As something of an encoding/escaping purist, I get the twitches looking at this. And for several days now I've asked myself:
Why?
Whose idea was it to exempt <script> (and, for example, quite distinctly not the JS event handlers like onclick) from the otherwise holy rule of 'non-HTML stuff between HTML tags should be HTML escaped', and why? Is it a case of 'this just grew that way historically, it's botched now, deal with it', or did someone sit down and think up something I'm not seeing?
The same is true (though less obviously so) for CSS and the <style> tag.
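The asymmetry I mean can be sketched roughly like this, assuming the page is parsed as ordinary text/html. The escapeHtml helper and the sample snippet are my own illustration, not taken from any particular library.

```javascript
// Hypothetical minimal HTML escaper, for illustration only.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

const code = 'if (a < b && c) alert("hi")';

// In an event-handler attribute the code is HTML-escaped like any other
// attribute value; the browser decodes the entities before running it.
const asAttribute = '<button onclick="' + escapeHtml(code) + '">Go</button>';

// Inside a <script> block the very same code has to stay raw: entities are
// not decoded there, so "&lt;" would reach the JavaScript parser verbatim.
const asScriptBlock = "<script>" + code + "</script>";
```

So the same piece of JavaScript needs two different treatments depending on where in the document it ends up.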
Do we even know what prompted this, or is it a case of lost knowledge? My Google-fu on this topic has been incredibly weak, and I've not found anything, but since this is actually bugging me in pathetically OCD ways, I'd love to hear explanations if anyone has any.