I'm parsing the following...

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE tox:message SYSTEM "http://tox.sf.net/tox/dtd/tox.dtd">
<tox:message xmlns:tox="http://tox.sourceforge.net/">
<tox:model owner="scott" package="queue" function="appendFact">
<tox:parameter value="  By John Smith   &ndash; Thu Feb 25, 4:54 pm ET&lt;br&gt;&lt;br&gt;NEW YORK (Reuters) &ndash; Nothing happened today."/>
<tox:parameter value="10245"/>
</tox:model>
</tox:message>

... using saxon9.jar, but got...

org.xml.sax.SAXParseException: The entity "ndash" was referenced, but not declared.

How do I "declare" an entity for a parse? How would I be able to anticipate all the potential entities?

A: 

You declare it in a DTD. Since you are using an external DTD, it has to declare it for you. Does tox.dtd contain a declaration for ndash?

If it does not, you need to supply the declaration yourself, for example by pulling extra declarations into an internal DTD subset, along these lines:

<!DOCTYPE foo [
    <!ENTITY % MathML SYSTEM "http://www.example.com/MathML.dtd">
    %MathML;
    <!ENTITY % SpeechML SYSTEM "http://www.example.com/SpeechML.dtd">
    %SpeechML;
]>

For example, you could pull in one of the standard XHTML DTDs that defines ndash.
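
If you only need a handful of entities, a simpler variant is to declare them directly in the internal subset. The following is a minimal sketch, not taken from the question: the document is a cut-down, hypothetical stand-in for the tox message (no namespace, no external DTD), and the class name `DeclareEntityDemo` is made up for illustration. It uses the JDK's built-in DOM parser rather than Saxon.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DeclareEntityDemo {
    // Hypothetical cut-down document: the internal subset declares ndash
    // itself, so the parser no longer reports it as undeclared.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE message [\n"
        + "  <!ENTITY ndash \"&#8211;\">\n"
        + "]>\n"
        + "<message><parameter value=\"Smith &ndash; Thu\"/></message>";

    // Parses the XML and returns the value attribute of the first <parameter>.
    static String parseValue(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("parameter").item(0)
                  .getAttributes().getNamedItem("value").getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        // The entity reference is expanded to the en dash character (U+2013).
        System.out.println(parseValue(XML));
    }
}
```

This does not answer the "anticipate all potential entities" part; for that, referencing a complete entity set from a DTD is the more practical route.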

If tox.dtd does declare it, then the parser is probably not finding the DTD, and you need an entity resolver to locate it.
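
A resolver could look roughly like the sketch below, which maps the DTD's system ID to a local copy so the parser never has to fetch it over the network. Assumptions: the JDK's built-in DOM parser is used instead of Saxon, the class name `LocalDtdResolverDemo` is hypothetical, and `LOCAL_DTD` is a one-line stand-in, since the real contents of tox.dtd are not known here. In practice you would return an `InputSource` over a file or classpath resource.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class LocalDtdResolverDemo {
    // System ID copied from the DOCTYPE in the question.
    static final String TOX_DTD = "http://tox.sf.net/tox/dtd/tox.dtd";

    // ASSUMPTION: stand-in DTD text; the real tox.dtd may look different.
    static final String LOCAL_DTD = "<!ENTITY ndash \"&#8211;\">";

    // Hypothetical cut-down document that points at the external DTD.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE message SYSTEM \"" + TOX_DTD + "\">\n"
        + "<message><parameter value=\"Smith &ndash; Thu\"/></message>";

    static String parseValue(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Serve the local copy when the parser asks for the tox DTD.
        EntityResolver resolver = (publicId, systemId) -> {
            if (TOX_DTD.equals(systemId)) {
                InputSource local = new InputSource(new StringReader(LOCAL_DTD));
                local.setSystemId(systemId);
                return local;
            }
            return null; // anything else: let the parser resolve it as usual
        };
        builder.setEntityResolver(resolver);

        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("parameter").item(0)
                  .getAttributes().getNamedItem("value").getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseValue(XML));
    }
}
```

The same `EntityResolver` can be set on a SAX `XMLReader` via `setEntityResolver` if you are feeding Saxon from a SAX source.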

bmargulies