tags:

views:

530

answers:

2

Here is the thing that I am trying to accomplish:

In broader sense, parse the XML data using a SAX parser and insert it into the appropriate database column in a MySQL table.

Here is sample Books.xml

 <?xml version="1.0" encoding="UTF-8"?>
<!--Sample XML file generated by XMLSpy v2009 sp1 (http://www.altova.com)--&gt;
<bks:books xsi:schemaLocation="urn:books Untitled1.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:bks="urn:books">
        <book id="String">
                <author>String</author>
                  <authorFirstName>String</authorFirstName>
                  <authorLastName>String</authorLastName>
                <title>String</title>
                   <titleNo>3</titleNo>
                <genre>String</genre>
                <offer>String</offer>
                   <price>3.14159E0</price>
                <pub_date>1967-08-13</pub_date>
                <review>String</review>
                  <reviewsratings></reviewratings>
        </book>
</bks:books>

Perl script that uses a SAX Parser:

#!usr/bin/perl -w

use XML::SAX::ParserFactory;
use MySaxHandler;
my $handler = MySaxHandler->new();
my $parser = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_uri("books.xml")

SAX Parser Handler Module MySaxHandler.pm:

package MySaxHandler;
use Data::Dumper;
use base qw(XML::SAX::Base);
my $in_books = 0;
sub start_document{
    my($self,$data) = @_;
    print "Parsing Started:\n";
}
sub start_element {
    my ($self,$data) = @_;
    my %attribs = %{$data->{'Attributes'}}; foreach( keys( %attribs )) { print " $_ = " . $attribs{$_}->{Value} . "\n"; } # -> Prints values of attributes.
    print "Starting element: ".$data->{Name}."\n\t";
    $in_books++;
    }
sub end_element {
    my($self,$data) = @_;
    print "\t Ending element:".$data->{Name}."\n";
    $in_books--;
}
sub characters{
    my($self,$data) = @_;
     if($in_productOffering){
      print "\t Element Values:".$data->{Data}."\n";
     }
}
sub end_document{
    my($self,$data) = @_;
        print "Parsing Completed\n";
}
1;

Steps:

  1. Parse XML using XML::SAX parser
  2. For each node in the XML if it has child nodes then traverse through it and once I reach the leaf node then generate insert statement dynamically for inserting that node value into Database table's column; e.g., if I have XML leaf node as price and its parent node is offer then Perl script should insert value of price node in price column of offer table in database.
  3. I have a large XML file which has many nodes and each node in turn has many child nodes and grand child nodes.

Question:

  1. How can I insert values of element tags in books.xml into the appropriate MySQL database columns for. E.g. price value should go into offer table's price column while I am parsing through the XML using the SAX Parser ?

    OR

  2. How can I generate all inserts and once I have all the insert statements then go and connect to a MySQL database and just dump them?

Tricky Part:

Tricky part is that there are some data that have interdependencies. E.g. price node in offer parent node is also related to default price node and so while generating inserts statements we have to keep in mind that values are properly inserted into the databases, but we are not allowed to use InnoDB table of MySQL but the only engine we are allowed to use is MyISAM.

What are possible suggestions in Perl to work around these issues?

A: 

I think your problem is step two. Based on your description of the problem, it sounds like you need to collect more information before you insert a record. Instead of inserting a record once you reach a node, wait until you collect everything you need. That might mean adding the record to a queue, for instance.

That's all I can say based on the very limited information you've provided. In you want better advice, construct a small demonstration script, perhaps with sample data, that illustrates what you are trying to do. When I have these sorts of problems, I test ideas in small programs instead of the big, production code.

brian d foy
Right now am in the process of generating small program which would clearly explain what am trying to achieve. I will post my response as soon as am done with creating a sample script.
Rachel
Yes. My problem is step 2. I can parse the XML but the thing I want to accomplish is to generate insert statements dynamically while am parsing through the XML. I have created modules like Offer.pm which would give me insert statements but my query is about how can I dynamically invoke those module while am parsing the XML.
Rachel
A: 

From my pov you may try to populate the data first into a Perl internal data structure (either an object or into a hash).
I would then write the SQLs based on this data structure - thus you have decoupled the database and the XML parsing.
It depends on the amount of data, if you are better of to do a bulk insert or perform a commit after every item.

weismat
As mentioned am using SAX parser and so parsing is done as event gets fired. Now from this how can I generate SQLs Inserts for inserting into Database ?
Rachel
SAX parsing simply means that you need to keep the history of read elements yourself - thus you need to cache all relevant elements until you move the data to the database. For this task you need a hash as a cache - after every new record you may then create the SQL based on the hash/cache and null it after each logical element.
weismat