I am opening an XML file that refers to a DTD as follows:

<?xml version="1.0" encoding="windows-1250"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
     "http://my.netscape.com/publish/formats/rss-0.91.dtd">

Here is the relevant part of the C# code:

public static XmlDocument FromUri(string uri)
{
    XmlDocument xmlDoc;
    WebClient webClient = new WebClient();

    using (Stream rssStream = webClient.OpenRead(uri))
    {
        XmlTextReader reader = new XmlTextReader(rssStream);
        xmlDoc = new XmlDocument();
        xmlDoc.XmlResolver = null;
        xmlDoc.Load(reader);
    }
    return xmlDoc;
}

When I call Load with this reader, I get the following error: "Expected DTD markup was not found." Is there any way to make the parser ignore the DOCTYPE declaration? Or is there a better way to do this?

A: 

http://my.netscape.com/publish/formats/rss-0.91.dtd leads to a 301 that, in turn, goes to http://netscape.aol.com/index.html

In other words, there is no DTD at this URL.

Stephane
+1  A: 

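As long as the DTD doesn't define any &entities; that you need to use (use character references instead!), you can tell XmlTextReader not to fetch external entities (including the DTD) by setting the reader's own XmlResolver property to null. Setting XmlResolver on the XmlDocument alone, as in the question, does not affect a reader that is passed to Load.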

(This should have been the default really. Most times you're reading an XML document you don't want it heading off to download a DTD, even when the DTD is still present. In this case AOL have behaved particularly badly by not only removing the DTD, but serving an incorrect 301 response to some HTML instead of the appropriate 404.)
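
For illustration, here is a minimal sketch of the question's method with the resolver cleared on the reader as well. The class name RssLoader and the FromStream helper are hypothetical names introduced only for this example, and it assumes the feed uses no entities that are declared solely in the external DTD.

```csharp
using System.IO;
using System.Net;
using System.Xml;

public static class RssLoader
{
    // Hypothetical helper: parses XML from a stream without fetching the external DTD.
    public static XmlDocument FromStream(Stream input)
    {
        XmlTextReader reader = new XmlTextReader(input);
        // The reader is what resolves the DOCTYPE's external subset,
        // so its resolver is the one that has to be null.
        reader.XmlResolver = null;

        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.XmlResolver = null; // kept from the question's code
        xmlDoc.Load(reader);
        return xmlDoc;
    }

    // Same signature as in the question; disposes the WebClient and the stream.
    public static XmlDocument FromUri(string uri)
    {
        using (WebClient webClient = new WebClient())
        using (Stream rssStream = webClient.OpenRead(uri))
        {
            return FromStream(rssStream);
        }
    }
}
```

Splitting out FromStream is only a convenience so the parsing step can be tried against a local file or an in-memory stream without touching the network.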

bobince
Thanks a lot, bobince!
Nikolan