tags:

views:

50

answers:

2

Hey,

I'm getting this error with following line:

Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new InputSource(new StringReader(html.toString())));

Details:

java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/html4/strict.dtd
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1290)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:677)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1315)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1282)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:283)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1192)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1089)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:1002)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:807)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:107)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:225)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
    at concurrency.Worker.run(Worker.java:56)
    at java.lang.Thread.run(Thread.java:619)

Does anyone have clue what might be the problem?

+1  A: 

Ok I've found solution.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(html.toString())));
l245c4l
A: 

The problem was that the W3 server was down at the time. An HTTP response code of 503 means "Service Unavailable".

What you have done is to tell the DOM parser not to try to fetch external DTDs. This effectively disables validation against any DTDs that your application cannot find locally. I believe this also has the side effect of not populating your DOM with default values for attributes.

The best solution would be to fetch copies of any external DTDs that your application is likely to use and either wire them into your application, or store them in a permanent local cache. This insulates your application against external server downtime, and also reduces the load on the W3 infrastructure.

Another alternative would be to configure your application to use a caching web proxy.

Stephen C