Here is the thing that I am trying to accomplish:
In broader sense, parse the XML data using a SAX parser and insert it into the appropriate database column in a MySQL table.
Here is sample Books.xml
<?xml version="1.0" encoding="UTF-8"?>
<!--Sample XML file generated by XMLSpy v2009 sp1 (http://www.altova.com)-->
<bks:books xsi:schemaLocation="urn:books Untitled1.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:bks="urn:books">
<book id="String">
<author>String</author>
<authorFirstName>String</authorFirstName>
<authorLastName>String</authorLastName>
<title>String</title>
<titleNo>3</titleNo>
<genre>String</genre>
<offer>String</offer>
<price>3.14159E0</price>
<pub_date>1967-08-13</pub_date>
<review>String</review>
<reviewsratings></reviewratings>
</book>
</bks:books>
Perl script that uses a SAX Parser:
#!usr/bin/perl -w
use XML::SAX::ParserFactory;
use MySaxHandler;
my $handler = MySaxHandler->new();
my $parser = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_uri("books.xml")
SAX Parser Handler Module MySaxHandler.pm
:
package MySaxHandler;
use Data::Dumper;
use base qw(XML::SAX::Base);
my $in_books = 0;
sub start_document{
my($self,$data) = @_;
print "Parsing Started:\n";
}
sub start_element {
my ($self,$data) = @_;
my %attribs = %{$data->{'Attributes'}}; foreach( keys( %attribs )) { print " $_ = " . $attribs{$_}->{Value} . "\n"; } # -> Prints values of attributes.
print "Starting element: ".$data->{Name}."\n\t";
$in_books++;
}
sub end_element {
my($self,$data) = @_;
print "\t Ending element:".$data->{Name}."\n";
$in_books--;
}
sub characters{
my($self,$data) = @_;
if($in_productOffering){
print "\t Element Values:".$data->{Data}."\n";
}
}
sub end_document{
my($self,$data) = @_;
print "Parsing Completed\n";
}
1;
Steps:
- Parse XML using XML::SAX parser
- For each node in the XML if it has child nodes then traverse through it and once I reach the leaf node then generate insert statement dynamically for inserting that node value into Database table's column; e.g., if I have XML leaf node as
price
and its parent node isoffer
then Perl script should insert value ofprice
node inprice
column ofoffer
table in database. - I have a large XML file which has many nodes and each node in turn has many child nodes and grand child nodes.
Question:
How can I insert values of element tags in books.xml into the appropriate MySQL database columns
for. E.g. price value should go into offer table's price column
while I am parsing through the XML using the SAX Parser ?OR
How can I generate all inserts and once I have all the insert statements then go and connect to a MySQL database and just dump them?
Tricky Part:
Tricky part is that there are some data that have interdependencies. E.g. price
node in offer
parent node is also related to default price
node and so while generating inserts statements we have to keep in mind that values are properly inserted into the databases, but we are not allowed to use InnoDB table of MySQL but the only engine we are allowed to use is MyISAM.
What are possible suggestions in Perl to work around these issues?