Hey all. Java's XML parser reports that my XML document is not well-formed following the root element, but I've validated it with several tools and they all disagree. It's probably an error in my code rather than in the document itself. I'd really appreciate any help you could offer.

Here is my Java method:

private void loadFromXMLFile(File f) throws ParserConfigurationException, IOException, SAXException {
    File file = f;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db;
    Document doc = null;
    db = dbf.newDocumentBuilder();
    doc = db.parse(file);
    doc.getDocumentElement().normalize();
    String desc = "";
    String due = "";
    String comment = "";
    NodeList tasksList = doc.getElementsByTagName("task");
    for (int i = 0; i < tasksList.getLength(); i++) {
        NodeList attributes = tasksList.item(i).getChildNodes();
        for (int j = 0; i < attributes.getLength(); j++) {
        Node attribute = attributes.item(i);
        if (attribute.getNodeName() == "description") {
            desc = attribute.getTextContent();
        }
        if (attribute.getNodeName() == "due") {
            due = attribute.getTextContent();
        }
        if (attribute.getNodeName() == "comment") {
            comment = attribute.getTextContent();
        }
        tasks.add(new Task(desc, due, comment));
        }
        desc = "";
        due = "";
        comment = "";
    }
}

And here is the XML file I'm trying to load:

<?xml version="1.0"?>  
<tasklist>  
    <task>  
        <description>Task 1</description>  
        <due>Due date 1</due>  
        <comment>Comment 1</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 2</description>  
        <due>Due date 2</due>  
        <comment>Comment 2</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 3</description>  
        <due>Due date 3</due>  
        <comment>Comment 3</comment>  
        <completed>true</completed>  
    </task>  
</tasklist>

And here is the error message Java is throwing for me:

run:
[Fatal Error] tasks.xml:28:3: The markup in the document following the root element must be well-formed.
May 17, 2010 6:07:02 PM todolist.TodoListGUI <init>
SEVERE: null
org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:239)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:208)
        at todolist.TodoListGUI.loadFromXMLFile(TodoListGUI.java:199)
        at todolist.TodoListGUI.<init>(TodoListGUI.java:42)
        at todolist.Main.main(Main.java:25)
BUILD SUCCESSFUL (total time: 19 seconds)

For reference, TodoListGUI.java:199 is

doc = db.parse(file);

If context is helpful to anyone here, I'm trying to write a simple GUI application to manage a todo list that can read and write to and from XML files defining the tasks.

Any advice is appreciated!

A: 

Try changing your XML declaration to:

<?xml version="1.0" encoding="UTF-8" ?>
EAMann
I've fixed his formatting; it does now.
Andrew Bullock
And I only make the `encoding` suggestion because, as far as I can tell and test, you *already* have a well-formed XML document ... maybe there's something else going on in your code.
EAMann
I got nothing from trying that. It's still giving me the same error with or without the encoding type.
Pyroclastic
A: 

I think there may be something wrong with the actual file. When I copy your code but use the XML as a string input to the parser it works fine (after fixing a couple of issues - attributes.item(i) should be attributes.item(j) and you need to break out of the loop when attribute == null).
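
For reference, the loop with those fixes applied might look roughly like the sketch below. It parses the XML from a string, as described above. The `TaskListParser` class and its nested `Task` are hypothetical stand-ins (the real `Task` class isn't shown in the question). Bounding the inner loop on `j` makes the null check unnecessary, and the sketch also compares names with `equals` rather than `==`, which is the safer way to compare strings in Java.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TaskListParser {
    // Hypothetical stand-in for the asker's Task class, which is not shown.
    static final class Task {
        final String description;
        final String due;
        final String comment;

        Task(String description, String due, String comment) {
            this.description = description;
            this.due = due;
            this.comment = comment;
        }
    }

    static List<Task> parse(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        List<Task> tasks = new ArrayList<>();
        NodeList taskNodes = doc.getElementsByTagName("task");
        for (int i = 0; i < taskNodes.getLength(); i++) {
            String desc = "";
            String due = "";
            String comment = "";
            NodeList children = taskNodes.item(i).getChildNodes();
            // Index the children with j, and bound the loop on j as well.
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                String name = child.getNodeName();
                if ("description".equals(name)) {
                    desc = child.getTextContent();
                } else if ("due".equals(name)) {
                    due = child.getTextContent();
                } else if ("comment".equals(name)) {
                    comment = child.getTextContent();
                }
            }
            // One Task per <task> element, added after all children are read.
            tasks.add(new Task(desc, due, comment));
        }
        return tasks;
    }
}
```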

In trying to reproduce your error, I can get the same message if I add another <tasklist></tasklist> element. This is because the XML no longer has a single root element (tasklist). Is this the problem you are seeing? Does the XML in tasks.xml have a single root element?
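
A minimal sketch of that reproduction, assuming the JDK's default parser: it feeds the parser a string with one root and then a string with a second `<tasklist>` appended. The `TwoRootsDemo` class and `parseError` method names are made up for illustration. The exact message text may vary by JDK and locale, so the sketch only reports whatever the parser says.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class TwoRootsDemo {
    // Returns null if the XML parses, otherwise "line:column: message".
    static String parseError(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            db.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e.getLineNumber() + ":" + e.getColumnNumber() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        String oneRoot = "<tasklist></tasklist>";
        String twoRoots = oneRoot + "\n<tasklist></tasklist>";
        System.out.println("one root:  " + parseError(oneRoot));
        System.out.println("two roots: " + parseError(twoRoots));
    }
}
```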

laz
A: 

For what it's worth, the Scala REPL successfully parsed your markup.

scala> val tree = <tasklist>
 | <task>
 | <description>Task 1</description>
 | <due>Due date 1</due>
 | <comment>Comment 1</comment>
 | <completed>false</completed>
 | </task>
 | <task>
 | <description>Task 2</description>
 | <due>Due date 2</due>
 | <comment>Comment 2</comment>
 | <completed>false</completed>
 | </task>
 | <task>
 | <description>Task 3</description>
 | <due>Due date 3</due>
 | <comment>Comment 3</comment>
 | <completed>true</completed>
 | </task>
 | </tasklist>
tree: scala.xml.Elem = 
<tasklist>
<task>
<description>Task 1</description>
<due>Due date 1</due>
<comment>Comment 1</comment>
<completed>false</completed>
</task>
<task>
<description>Task 2</description>
<due>Due date 2</due>
<comment>Comment 2</comment>
<completed>false</completed>
</task>
<task>
<description>Task 3</description>
<due>Due date 3</due>
<comment>Comment 3</comment>
<completed>true</completed>
</task>
</tasklist>
ewg
A: 

org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.

This particular exception indicates that there is more than one root element in the XML document. In other words, <tasklist> is not the only root element. To take your XML document as an example, imagine it without the <tasklist> element, leaving three <task> elements at the top level. That would cause this kind of exception.

Since the XML file you posted looks fine, the problem lies somewhere else. It looks like the parser is not reading the file you expect it to read. For quick debugging, add the following to the top of your method:

System.out.println(f.getAbsolutePath());

Locate that file on disk and verify its contents.
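
If you'd rather check from inside the program, something like the following sketch could print the resolved path together with the numbered contents of the file. `FileCheck` and `describe` are hypothetical names, and decoding as UTF-8 is an assumption. Note that the parser reported the error at tasks.xml:28:3, while the XML as posted has only 21 lines, which suggests the file on disk contains more than what was pasted.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class FileCheck {
    // Builds a report: absolute path, size, and the file's lines with numbers.
    static String describe(File f) throws IOException {
        StringBuilder sb = new StringBuilder();
        sb.append("path: ").append(f.getAbsolutePath()).append('\n');
        sb.append("exists: ").append(f.exists()).append('\n');
        if (f.isFile()) {
            byte[] bytes = Files.readAllBytes(f.toPath());
            sb.append("bytes: ").append(bytes.length).append('\n');
            // Assumes UTF-8; malformed bytes are replaced rather than rejected.
            String text = new String(bytes, StandardCharsets.UTF_8);
            String[] lines = text.split("\\R", -1);
            for (int i = 0; i < lines.length; i++) {
                sb.append(i + 1).append(": ").append(lines[i]).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Calling `System.out.println(FileCheck.describe(f));` just before `db.parse(file)` would show whether there is anything at line 28.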

BalusC
A: 

Another "for what it's worth": here is what I got when I saved your XML into a file called test.xml and ran it through xmllint.

[jhr@Macintosh] [~]
xmllint test.xml
<?xml version="1.0"?>
<tasklist>  
    <task>  
        <description>Task 1</description>  
        <due>Due date 1</due>  
        <comment>Comment 1</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 2</description>  
        <due>Due date 2</due>  
        <comment>Comment 2</comment>  
        <completed>false</completed>  
    </task>  
    <task>  
        <description>Task 3</description>  
        <due>Due date 3</due>  
        <comment>Comment 3</comment>  
        <completed>true</completed>  
    </task>  
</tasklist>

Seems to be fine. Most likely you have some stray characters in your actual file that you can't see. Try viewing the file in an editor that shows non-printable characters. As someone else suggested, if this isn't an English UTF-8 machine, you might have some Unicode characters that you can't see but the parser can. Either that, or you aren't loading the file that you think you are. Step through with a debugger and check the actual contents of the file before they get fed into the parser.
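
If you don't have such an editor handy, a small scan along these lines could flag anything that isn't plain printable ASCII or ordinary whitespace. This is only a sketch: `InvisibleCharScanner` and `findSuspicious` are made-up names, and treating every non-ASCII character as suspicious is an assumption that only makes sense because the posted file is pure ASCII.

```java
import java.util.ArrayList;
import java.util.List;

class InvisibleCharScanner {
    // Reports the position and code point of every character that is neither
    // printable ASCII nor a tab, line feed, or carriage return.
    static List<String> findSuspicious(String text) {
        List<String> hits = new ArrayList<>();
        int line = 1;
        int col = 1;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean ordinaryWhitespace = cp == '\t' || cp == '\n' || cp == '\r';
            boolean printableAscii = cp >= 0x20 && cp <= 0x7E;
            if (!ordinaryWhitespace && !printableAscii) {
                hits.add("line " + line + ", col " + col + ": U+" + String.format("%04X", cp));
            }
            if (cp == '\n') {
                line++;
                col = 1;
            } else {
                col++;
            }
            i += Character.charCount(cp);
        }
        return hits;
    }
}
```

Feed it the file's contents as a string; an empty list means nothing unusual was found.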

fuzzy lollipop
A: 

Are you sure that's everything in that file? The error is complaining that there is more markup after the current root element, so there must be something else after </tasklist>.

Sometimes, this error may be caused by non-printable characters. If you don't see anything, do a hexdump of the file.
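
If no hexdump tool is available, a rough equivalent can be sketched in Java itself. The `HexDump` class below is hypothetical, and its layout (offset, sixteen hex bytes, ASCII column) merely imitates the usual tool output.

```java
class HexDump {
    // Formats bytes as rows of: offset, up to 16 hex bytes, ASCII rendering.
    static String dump(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int off = 0; off < data.length; off += 16) {
            sb.append(String.format("%08x ", off));
            StringBuilder ascii = new StringBuilder();
            for (int i = off; i < off + 16; i++) {
                if (i < data.length) {
                    int b = data[i] & 0xFF;
                    sb.append(String.format(" %02x", b));
                    // Non-printable bytes show up as '.' in the ASCII column.
                    ascii.append(b >= 0x20 && b <= 0x7E ? (char) b : '.');
                } else {
                    sb.append("   "); // pad a short final row
                }
            }
            sb.append("  |").append(ascii).append("|\n");
        }
        return sb.toString();
    }
}
```

For example, `System.out.print(HexDump.dump(java.nio.file.Files.readAllBytes(f.toPath())));` would print the whole file; the bytes after the closing `</tasklist>` are the ones to look at.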

ZZ Coder