I have a file that consists of concatenated valid XML documents. I'd like to separate individual XML documents efficiently.
The contents of the concatenated file look like this, so the concatenated file is not itself a well-formed XML document:
<?xml version="1.0" encoding="UTF-8"?>
<someData>...</someData>
<?xml version="1.0" encoding="UTF-8"?>
<someData>...</someData>
<?xml version="1.0" encoding="UTF-8"?>
<someData>...</someData>
Each individual XML document is around 1-4 KB, but there are potentially a few hundred of them. All of the XML documents conform to the same XML Schema.
Any suggestions or tools? I am working in the Java environment.
Edit: I am not sure whether the XML declaration will be present in each document.
Edit: Let's assume that the encoding for all of the XML documents is UTF-8.
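To make the problem concrete, here is a minimal sketch of one possible approach using only the JDK; it is not necessarily the most efficient one. It assumes the whole file fits in memory as a UTF-8 string (a few hundred documents of 1-4 KB each), that no document contains a DOCTYPE, and that the text `<?xml` never occurs inside a comment or CDATA section. The class name `XmlSplitter` and the synthetic `wrapper` root element are hypothetical names chosen for illustration. Each document is re-serialized from a DOM, so the output may differ from the original bytes in whitespace or attribute quoting.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: splits concatenated XML documents into separate strings.
public class XmlSplitter {

    public static List<String> split(String concatenated) {
        // Remove XML declarations if present; a declaration is only legal at the
        // very start of a document, so they cannot stay inside the wrapper.
        String body = concatenated.replaceAll("<\\?xml\\s[^>]*\\?>", "");
        // Wrap everything in a synthetic root so the text parses as one document.
        String wrapped = "<wrapper>" + body + "</wrapper>";
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wrapped)));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            List<String> documents = new ArrayList<>();
            // Every element child of the wrapper is the root of one original document.
            for (Node child = doc.getDocumentElement().getFirstChild();
                 child != null;
                 child = child.getNextSibling()) {
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace, comments, etc. between documents
                }
                StringWriter out = new StringWriter();
                transformer.transform(new DOMSource(child), new StreamResult(out));
                documents.add(out.toString());
            }
            return documents;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input could not be split as XML", e);
        }
    }
}
```

Calling `XmlSplitter.split(fileContents)` returns one string per document, whether or not the declarations are present. For larger inputs, a streaming variant would presumably be preferable to building a DOM of the whole file.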