Is there an input class in Hadoop for dealing with [multiple] large XML files based on their tree structure? I have a set of XML files that all follow the same schema, and I need to split them into records at section boundaries rather than breaking a section apart across splits.
For example, an XML file would look like:
<root>
    <parent> data </parent>
    <parent> more data </parent>
    <parent> even more data </parent>
</root>
I would define each section as one /root/parent element.
What I'm asking is: is there a record reader / input format already included with Hadoop that does this?
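For context, here is a rough sketch of the kind of job setup I am hoping is possible. XmlInputFormat, xmlinput.start, and xmlinput.end are placeholder names I made up for whatever such a reader might actually be called; the point is just that each map() call should receive one complete <parent>...</parent> block as its value.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class XmlSectionDriver {

    // Each map() call would see the raw XML of a single section,
    // e.g. "<parent> data </parent>", as its value.
    public static class SectionMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(new Text("section"), value);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder keys: tell the (hoped-for) input format where a record starts and ends.
        conf.set("xmlinput.start", "<parent>");
        conf.set("xmlinput.end", "</parent>");

        Job job = Job.getInstance(conf, "xml section split");
        job.setJarByClass(XmlSectionDriver.class);
        job.setMapperClass(SectionMapper.class);
        // job.setInputFormatClass(XmlInputFormat.class); // <-- the class I'm looking for
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

If nothing like this ships with Hadoop, I assume I would have to write my own InputFormat/RecordReader pair, which is exactly what I am hoping to avoid.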