I need to read data from an XML file and store it in a MySQL database. I am thinking of using a SAX parser to parse the data, but I am not sure how to store it in the database efficiently. I am considering technologies such as JDBC and Hibernate. Which would be the most efficient way to do this?

Note: The programming language is Java.

+2  A: 

I would suggest using JAXB to read the XML into Java objects and JPA to write them to the database. You can create a single data model of Java classes that carry both JAXB annotations for XML binding and JPA annotations for database persistence.

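// Package names assume the javax.* APIs (jakarta.* on newer stacks)
import javax.persistence.Entity;
import javax.persistence.Table;
import javax.xml.bind.annotation.XmlRootElement;

@Entity                 // JPA: persist instances of this class
@Table(name="Foo")      // JPA: map to the table "Foo"
@XmlRootElement         // JAXB: bind to an XML root element
public class Foo {
    // ...
}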

Information on JAXB annotations. Information on JPA.

Chris Dail
I am new to both JAXB and JPA. Where would you recommend I look for information relevant to my task?
Rachel
Blaise Doughan
A: 

It depends on many factors. If your XML is very large (more than 1 GB, or comparable to your total memory), you should use SAX; I don't think there is another solution. If it is small (say, under 100 MB), simply load the whole XML into a Document object using JAXP:

DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder parser = documentBuilderFactory.newDocumentBuilder();
Document doc = parser.parse(source);

You probably have elements or attributes mapped to columns in the database. You can then query the elements and attributes using XPath for simplicity and write them to the database. If this is a one-time conversion, I recommend using simple JDBC. Don't bother with JPA or Hibernate, as they just increase your development time for a routine data conversion scenario.
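The XPath-then-JDBC approach described above could look roughly like the following. This is only a minimal sketch under assumptions: the `<items>/<item id="..."><name>` layout, the `item` table with `id` and `name` columns, and the class and method names are all hypothetical and would need to match your real XML and schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathToJdbc {

    /** Extracts an (id, name) pair from every <item> element (hypothetical layout). */
    public static List<String[]> extractRows(InputSource source) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source);
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList items = (NodeList) xpath.evaluate("/items/item", doc, XPathConstants.NODESET);

        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            String id = item.getAttribute("id");
            String name = xpath.evaluate("name", item); // evaluated relative to this <item>
            rows.add(new String[] {id, name});
        }
        return rows;
    }

    /** Writes all rows with one batched PreparedStatement in a single transaction. */
    public static void insertRows(Connection conn, List<String[]> rows) throws SQLException {
        String sql = "INSERT INTO item (id, name) VALUES (?, ?)"; // hypothetical table
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // undo the partial load before propagating the error
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

The `Connection` would come from `DriverManager.getConnection(...)` with your own MySQL URL and credentials. Keeping extraction and insertion in separate methods means the XPath part can be checked without a database.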

Mohsen
@Mohsen: It is not a one-time conversion. Once the data is stored in MySQL, I may have to change it in the future as business or decision rules change. Given that requirement, would JDBC still be the right option, or would Hibernate, JPA, or some other ORM technology such as TopLink or iBATIS be more useful?
Rachel
+1  A: 

You could use Castor, which is an open source data binding framework for moving data from XML to Java objects and from Java to databases.

I also found an article series on IBM developerWorks that describes using Castor in a way suited to your needs.

cetnar
@Cetnar: Thank you very much for the information. I will go through it.
Rachel
@Cetnar: Does it scale well for large data sets (>1GB)?
Rachel
@Rachel: Do you have large (>1GB) XML files? If performance is an issue, you can try other libraries such as JiBX (http://www.jibx.org/), or parse the XML with a StAX parser. Here is a link to performance tests of similar libraries: http://www.ibm.com/developerworks/xml/library/x-databdopt2/ (a bit outdated, I suppose).
cetnar
@Cetnar: My XML file is larger than 1GB, and yes, performance is the major issue. If I use Castor, I can map the Java objects to MySQL with it. But if I use JiBX or StAX instead, what would be the best approach for mapping the Java objects to MySQL? Hibernate, JPA, or JDBC: which technology would be the most efficient for my case?
Rachel
If you only plan to move data from XML to the database (modifying it on the fly), then in my opinion plain JDBC will be the most effective from a performance point of view.
cetnar
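For files too large to hold in memory, the StAX-plus-plain-JDBC combination mentioned in the comments above might be sketched as follows. This is an illustration under assumptions, not a tuned loader: the `<item id="..."><name>` layout, the `item` table, the `RowSink` interface, and the class name are all hypothetical.

```java
import java.io.Reader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxBatchLoader {

    /** Receives one parsed record at a time, keeping parsing separate from storage. */
    public interface RowSink {
        void accept(String id, String name) throws SQLException;
    }

    /** Streams records to the sink without building a tree; returns the record count. */
    public static int stream(Reader xml, RowSink sink) throws XMLStreamException, SQLException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTD processing needed here
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        int count = 0;
        String id = null;
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String tag = reader.getLocalName();
                if ("item".equals(tag)) {
                    id = reader.getAttributeValue(null, "id"); // remember the enclosing item's id
                } else if ("name".equals(tag)) {
                    sink.accept(id, reader.getElementText());
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    /** Inserts records through a batched PreparedStatement, flushing every batchSize rows. */
    public static int load(Reader xml, Connection conn, int batchSize)
            throws XMLStreamException, SQLException {
        conn.setAutoCommit(false);
        String sql = "INSERT INTO item (id, name) VALUES (?, ?)"; // hypothetical table
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int[] pending = {0};
            int total = stream(xml, (id, name) -> {
                ps.setString(1, id);
                ps.setString(2, name);
                ps.addBatch();
                if (++pending[0] == batchSize) {
                    ps.executeBatch();
                    pending[0] = 0;
                }
            });
            ps.executeBatch(); // flush the final partial batch
            conn.commit();
            return total;
        } catch (SQLException | XMLStreamException e) {
            conn.rollback();
            throw e;
        }
    }
}
```

Memory use stays roughly constant because only one record and one pending batch are held at a time. The sketch commits once at the end; for very large loads, committing after each flushed batch may be preferable. With MySQL Connector/J, the `rewriteBatchedStatements=true` connection property is commonly used to speed up batched inserts, though the gain should be measured on your own data.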
A: 

You can store the XML in MySQL directly as a BLOB. If you want efficient indexing and high performance, VTD-XML has a built-in ability to index, query, and update XML documents, which makes it a better alternative to SAX and DOM. Here is a link to a related article:

Index XML documents with VTD-XML

vtd-xml-author
I have to perform some operations on the XML depending on decision rules. Would this approach be useful in my case?
Rachel
If SAX or DOM is what you plan to use, then VTD-XML should be an improvement. So yes: it is simply a better parser with built-in indexing.
vtd-xml-author