I use a xsl tranform to convert a xml file to html in dotNet. I transform the node values in the xml to html tag contents and attributes.
I compose the xml by using .Net DOM manipulation, setting the InnerText property of the nodes with the arbitrary and possibly malicious text. Right now, maliciously crafted input strings will make my html unsafe. Unsafe in the sense that some javascript might come from the the user and find its way to a link href attribute in the output html, for example.
The question is simple, what is the sanitizing, if any, that I have to do with my text before assigning it to the InnerText property? I thought that assigning to InnerText instead of InnerXml would do all the needed sanitization of the text, but that seems to not be the case.
Does my transform have to have any special characteristics to make this work safely? Any .net specific caveats that I should be aware?
Thanks!