Hi, I'm making an RSS feed. I was looking at the official example and I noticed that some characters, such as < and >, were replaced with &lt; and &gt;.

I therefore assume that & must also be replaced with &amp;.

Are there other characters that I must escape before copying them into the description? Note that the description text comes from an untrusted source, so it should never be able to "break out" of the description tag or make the RSS feed invalid.

I don't think it matters, but the encoding is UTF-8.

+1  A: 

Microsoft Support lists the following:

  • Ampersand, &
  • Left angle bracket, <
  • Right angle bracket, >
  • Straight quotation mark, "
  • Apostrophe, '
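
For illustration, here is a minimal Python sketch that escapes those five characters. It assumes the standard library's `xml.sax.saxutils.escape`, which handles &, < and > by default and takes a mapping for any extra replacements; the function name `escape_for_xml` and the sample string are hypothetical.

```python
from xml.sax.saxutils import escape

# Extra replacements for the two quote characters; escape() already
# covers &, < and > without being told to.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_for_xml(text: str) -> str:
    """Return text with the five XML special characters replaced by entities."""
    return escape(text, QUOTE_ENTITIES)


# Hypothetical untrusted input that tries to close the description tag early.
untrusted = 'Tom & Jerry </description><script>alert("x")</script>'
print(escape_for_xml(untrusted))
```

The ampersand is replaced first, so the entities produced for the other characters are not escaped a second time.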
Anders Lindahl
+1  A: 

An RSS feed is a special type of XML document. See this XML spec for the list of special characters.

JRL
+1  A: 

Most people should never use string manipulation to construct XML documents.

Use your programming language's XML library; it will automatically formulate well-formed XML for you. The authors of that XML library read the XML recommendation extremely closely so that you don't have to. Failing to encode character entities is one way you can emit badly-formed XML without intending to, but it's far from the only one.
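
As a rough sketch of what that looks like in practice, the following uses Python's standard `xml.etree.ElementTree` to build a one-item feed. The function name `build_feed`, the channel title, and the sample values are made up for illustration, and a real feed would need more channel elements than shown here.

```python
import xml.etree.ElementTree as ET


def build_feed(item_title: str, item_link: str, item_description: str) -> str:
    """Build a tiny RSS 2.0 document and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example feed"  # placeholder title

    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = item_title
    ET.SubElement(item, "link").text = item_link
    # The untrusted text is assigned as-is; the library escapes the
    # markup characters when the tree is serialized.
    ET.SubElement(item, "description").text = item_description

    return ET.tostring(rss, encoding="unicode")


# Hypothetical untrusted description that tries to inject an element.
untrusted = 'Tom & Jerry </description><injected/>'
feed_xml = build_feed("Hello", "http://example.com/1", untrusted)
print(feed_xml)
```

Parsing the output again gives back the original description text unchanged, which is a simple way to check that nothing "broke out" of the tag.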

Robert Rossney