
In my C# project, I have been given the task of parsing an SGML file. I tried, very naively, to use XmlReader, which led to some interesting revelations (e.g., the difference between SGML and well-formed XML).

So I think I just need a good SGML parser that converts the file to XML, and I can go from there. In my search, I have found two SGML parsers that can integrate with my C# project:

Any other recommendations?

+2  A: 

Apparently an updated version of SgmlReader is available here:

http://developer.mindtouch.com/Community/SgmlReader

GP
A: 

HTML is an application of SGML, so if you want to parse HTML properly, you will need an SGML parser. SgmlReader appears to fit those needs well, and I plan to use it myself. As an alternative, I would suggest HTML Tidy. It is a native application, but .NET bindings for it do exist. If you need entirely managed code, then SgmlReader is the way to go.
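For what it's worth, here is a minimal sketch of the "convert to XML and go from there" approach with SgmlReader. It assumes the SgmlReader assembly (namespace `Sgml`) is referenced in the project and that the input is HTML-like, so the reader's built-in HTML DTD can be used. The class name `SgmlToXml` and method `Convert` are made up for illustration. For a custom SGML vocabulary you would most likely have to point the reader at your own DTD rather than set `DocType = "HTML"`.

```csharp
using System.IO;
using System.Xml;
using Sgml; // assumption: the SgmlReader library is referenced by the project

// Hypothetical helper, not part of SgmlReader itself.
public static class SgmlToXml
{
    // Reads HTML-like SGML markup and returns it as a well-formed XmlDocument.
    public static XmlDocument Convert(string markup)
    {
        using (var input = new StringReader(markup))
        {
            var sgml = new SgmlReader
            {
                DocType = "HTML",                 // assumption: built-in HTML DTD is adequate
                CaseFolding = CaseFolding.ToLower, // normalize element names to lower case
                WhitespaceHandling = WhitespaceHandling.All,
                InputStream = input
            };

            // SgmlReader derives from XmlReader, so XmlDocument can load from it directly.
            var doc = new XmlDocument { XmlResolver = null };
            doc.Load(sgml);
            return doc;
        }
    }
}
```

Once the document is loaded, the usual XML tooling applies, e.g. `SgmlToXml.Convert(File.ReadAllText(path)).Save("out.xml")` or XPath queries via `SelectNodes`.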

Keith
Agreed. I've since been using SgmlReader and it has worked pretty well.
GP