



The code:

using (XmlReader xmlr = XmlReader.Create(new StringReader(allXml)))
{
    var items = from item in SyndicationFeed.Load(xmlr).Items
                select item;
}

The exception:

Exception: System.Xml.XmlException: Unexpected node type Element. 
   ReadElementString method can only be called on elements with simple or empty content. Line 11, position 25.
   at System.Xml.XmlReader.ReadElementString()
   at System.ServiceModel.Syndication.Rss20FeedFormatter.ReadXml(XmlReader reader, SyndicationFeed result)
   at System.ServiceModel.Syndication.Rss20FeedFormatter.ReadFeed(XmlReader reader)
   at System.ServiceModel.Syndication.Rss20FeedFormatter.ReadFrom(XmlReader reader)
   at System.ServiceModel.Syndication.SyndicationFeed.Load[TSyndicationFeed](XmlReader reader)
   at System.ServiceModel.Syndication.SyndicationFeed.Load(XmlReader reader)
   at Ionic.ToolsAndTests.ReadRss.Run() in c:\dev\dotnet\ReadRss.cs:line 90

The XML content:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="" media="screen"?><rss version="2.0" 
  xmlns:atom="" >
  <title>Software architecture, software engineering, and Renaissance Jazz</title>
  <atom:link rel="self" type="application/rss+xml" href="" />
  <description>Software architecture, software engineering, and Renaissance Jazz</description>
  <copyright>Copyright <script type='text/javascript'> document.write( (1273534889181));</script></copyright>
  <lastBuildDate>Mon, 10 May 2010 19:41:29 -0400</lastBuildDate>

As you can see, on line 11, at position 25, there's a script block inside the <copyright> element.

Other people have reported similar errors with other XML documents.

I worked around this by calling StreamReader.ReadToEnd, running Regex.Replace on the result to strip out the script block, and then passing the modified string to XmlReader.Create(). It feels like a hack.
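A minimal sketch of that workaround, assuming the offending markup is always a <script> block. FeedCleaner and StripScripts are hypothetical names, and the pattern is illustrative rather than the exact expression I used:

```csharp
using System;
using System.Text.RegularExpressions;

static class FeedCleaner
{
    // Hypothetical helper: removes every <script>...</script> block
    // from the raw feed text before it is handed to an XmlReader.
    public static string StripScripts(string xml)
    {
        // Singleline lets '.' match newlines, so multi-line scripts are
        // removed too; the lazy '.*?' stops at the first closing tag.
        return Regex.Replace(
            xml,
            @"<script\b[^>]*>.*?</script>",
            String.Empty,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}
```

The cleaned string then goes to XmlReader.Create(new StringReader(cleaned)) in place of the original text.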

  1. Has anyone got a better approach? I don't like this one because I have to read a 125k string into memory.

  2. Is it valid RSS to include "complex content" like that - a script block inside an element?


No, no one has a better approach.

No, it sure doesn't look like valid RSS. At the very least it's bad manners.


You can subclass XmlTextReader and override ReadElementString to skip or modify the offending element as it's being read. It still feels like a hack, but at least it avoids the regex pre-processing.

Here's a simple implementation that gets the job done:

class BrokenFeedXmlReader : XmlTextReader
{
    // Additional XmlTextReader constructors can be added in
    // similar fashion as needed
    public BrokenFeedXmlReader(TextReader input)
        : base(input) { }

    public override string ReadElementString()
    {
        if ("copyright" == Name)
        {
            // Consume the whole element, nested <script> included,
            // so the reader advances to the next sibling.
            Skip();
            return String.Empty;
        }

        return base.ReadElementString();
    }
}

Your example code would then look something like this:

using (XmlReader xmlr = new BrokenFeedXmlReader(new StringReader(allXml)))
{
    var items = from item in SyndicationFeed.Load(xmlr).Items
                select item;
}