Here's my scenario:

I need to generate XML via Perl against a schema that is full of <xs:sequence> tags (i.e. the elements MUST appear in order). I don't have control over the schema (it's third party), and we've had so many problems whenever we add new CPAN modules (we don't have a good way to propagate them to customers, etc.) that we've basically been forbidden from adding anything new (like XML::Writer).

XML Modules at my disposal are: XML::Parser, XML::Simple, XML::XPath.

I really like the way XML::Simple lets you build a hashref of hash/array refs and then just spit out the XML.

Is there any way to do this with XML::Simple? Or should I perhaps roll my own code for spitting out the XML in order? My biggest problem seems to be that I'd need to pull things from the hashref in insertion order, which Perl doesn't really do all that well. I did read about Tie::IxHash for pulling things out in insertion order, but again, that's a module I don't have.

It feels like I'm kind of SOL, but would definitely appreciate any tricks/ideas that somebody might have. Thanks.
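
For the "roll my own" route mentioned above, here is a minimal sketch that needs no modules at all. It assumes the data is held as nested array refs of `[ name => value ]` pairs rather than hashes, so insertion order is preserved by construction. The names `to_xml` and `escape_xml` are made up for illustration, and attributes, namespaces and encoding declarations are deliberately ignored.

```perl
use strict;
use warnings;

# Escape the characters that are unsafe in element text.
sub escape_xml {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# A node is [ $name => $value ], where $value is either a plain string
# or an array ref of child nodes. Children are emitted in array order.
sub to_xml {
    my ($node, $indent) = @_;
    $indent = '' unless defined $indent;
    my ($name, $value) = @$node;
    if (ref $value eq 'ARRAY') {
        my $children = join '', map { to_xml($_, "$indent  ") } @$value;
        return "$indent<$name>\n$children$indent</$name>\n";
    }
    return "$indent<$name>" . escape_xml($value) . "</$name>\n";
}

my $doc = [ supertag => [
    [ tag1  => 'value 1' ],
    [ tag3  => 'value 3' ],
    [ tag4  => 'value 4' ],
    [ tag10 => 'value 2' ],
] ];

my $xml = to_xml($doc);
print $xml;
```

The trade-off is that you give up the convenient hashref structure, which is what the answers below try to keep.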

+1  A: 

You could still use `XML::Simple` and provide a hook method that does the sorting. I know, it's ugly and you'd rather have something that doesn't require additional code.

innaM
Yeah, I just noticed this as well. This is probably what I'll end up doing (especially since I only realized the elements HAD to be ordered AFTER I wrote up the whole thing using `XML::Simple`).
Morinar
+5  A: 

Most of the time, configuration can be done with just the options that XML::Simple provides. Normally, it will automatically fold data down into as simple a data structure as can be logically reproduced; it's great for data storage formats, but not as strong when it comes to matching a document format. Fortunately, even though it's "simple", it's incredibly powerful.

To control the order of output of elements, you have a few options. You can use arrays, which guarantee the order of data. However, it looks like you need a particular order of values for one tag.

Key sorting is an automatic feature, too. As long as the order you need is alphabetical, the keys are guaranteed to be sorted in that order.
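
As a quick illustration of those two defaults, here is a small sketch (not part of the original answer). The options `NoAttr => 1` and `KeyAttr => []` are my own choices to keep the output free of attributes and folding, and the `list`/`item` names are invented for the example.

```perl
use strict;
use warnings;
use XML::Simple;

# NoAttr => 1 writes scalar values as nested elements instead of attributes;
# KeyAttr => [] switches off hash/array folding so the structure is emitted as-is.
my $xs = XML::Simple->new(KeepRoot => 1, NoAttr => 1, KeyAttr => []);

# Repeated elements held in an array come out in array order.
my $list_xml = $xs->XMLout({ list => { item => [ 'first', 'second', 'third' ] } });
print $list_xml;

# Hash keys come out in string-sorted order, so tag10 lands before tag3.
my $hash_xml = $xs->XMLout({
    supertag => {
        tag3  => 'value 3',
        tag10 => 'value 2',
        tag1  => 'value 1',
        tag4  => 'value 4',
    },
});
print $hash_xml;
```

The second call shows why the default is not enough here: the sort is a plain string sort, so `tag10` is placed between `tag1` and `tag3`.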

But many times, especially with very specific schemas, that won't do. Fortunately, XML::Simple still supports a way to customize it: you must use the OO interface, and override the sorted_keys method. Here's an example:

use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

package MyXMLSimple;      # my XML::Simple subclass 
use base 'XML::Simple';

# Overriding the method here
sub sorted_keys
{
   my ($self, $name, $hashref) = @_;
   if ($name eq 'supertag')   # the only tag whose child order I care about
   {
      return ('tag1', 'tag3','tag4','tag10'); # so I specify exactly the right order.
   }
   return $self->SUPER::sorted_keys($name, $hashref); # everything else keeps the default order
}

package main; # back to main code

my $xmlParser = MyXMLSimple->new(      # Create the parser, with options:
                  KeepRoot => 1,       # keep the root element name as the top-level hash key.
                  ForceContent => 1,   # on input, always store text under a 'content' key.
               );

my $structure = { 
   supertag => { 
      tag1  => { content => 'value 1' },
      tag10 => { content => 'value 2' },
      tag3  => { content => 'value 3' },
      tag4  => { content => 'value 4' },
   },
};

my $xml = $xmlParser->XMLout($structure);
print "The xml generated is:\n$xml\n";
print "The read in structure then is:\n" . Dumper($xmlParser->XMLin($xml)) . "\n";

This will give us:

The xml generated is:
<supertag>
  <tag1>value 1</tag1>
  <tag3>value 3</tag3>
  <tag4>value 4</tag4>
  <tag10>value 2</tag10>
</supertag>

The read in structure then is:
$VAR1 = {
          'supertag' => {
                          'tag10' => {
                                       'content' => 'value 2'
                                     },
                          'tag3' => {
                                      'content' => 'value 3'
                                    },
                          'tag1' => {
                                      'content' => 'value 1'
                                    },
                          'tag4' => {
                                      'content' => 'value 4'
                                    }
                        }
        };

Check out the XML::Simple page on CPAN.

Robert P
Hmm... I've read the CPAN page up and down, but don't know that I ever really understood this piece of functionality. This definitely seems like what I need.
Morinar
Unsure how to turn that into: `<opt><supertag><tag>value 1</tag><tag>value 2</tag><tag>value 3</tag><tag>value 4</tag></supertag></opt>` which is what I actually want. Any pointers?
Morinar
Check out my new answer. :)
Robert P
Wow! That's a great piece of code to build from. Thanks! I'll let you know how it comes out.
Morinar
This was definitely the win. I came up with this relatively clever subroutine that adds elements to the structure and also adds them to a hash with a sorting value. Then the sorting value is used by the sorted_keys function. I now have my elements in perfect order with a very small amount of overhead. Thanks all!
Morinar
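
Morinar's actual subroutine isn't shown, so the following is only a guess at the shape of that approach. The package name `OrderedXMLSimple`, the helper `add_element` and the `%position` hash are all hypothetical, and for simplicity the sort value is keyed by element name alone, which assumes a name never needs different positions under different parents.

```perl
use strict;
use warnings;

package OrderedXMLSimple;      # hypothetical XML::Simple subclass
use base 'XML::Simple';

my %position;                  # element name => sort value (insertion counter)
my $counter = 0;

# Hypothetical helper: store the value and remember when the name was first added.
sub add_element {
    my ($hashref, $name, $value) = @_;
    $hashref->{$name} = $value;
    $position{$name} = $counter++ unless exists $position{$name};
    return $hashref;
}

# Order keys by their recorded sort value; fall back to the default
# ordering if any key was not added through add_element().
sub sorted_keys {
    my ($self, $name, $hashref) = @_;
    my @keys = keys %$hashref;
    return $self->SUPER::sorted_keys($name, $hashref)
        if grep { !exists $position{$_} } @keys;
    return sort { $position{$a} <=> $position{$b} } @keys;
}

package main;

my %supertag;
OrderedXMLSimple::add_element(\%supertag, tag1  => 'value 1');
OrderedXMLSimple::add_element(\%supertag, tag3  => 'value 3');
OrderedXMLSimple::add_element(\%supertag, tag4  => 'value 4');
OrderedXMLSimple::add_element(\%supertag, tag10 => 'value 2');

# NoAttr and KeyAttr are my own choices here, to get plain nested elements.
my $xs  = OrderedXMLSimple->new(KeepRoot => 1, NoAttr => 1, KeyAttr => []);
my $xml = $xs->XMLout({ supertag => \%supertag });
print $xml;
```

Compared with hard-coding the list in `sorted_keys`, this keeps the ordering next to the code that builds the structure, at the cost of one extra hash.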
A: 

This code will produce the output you asked for in a comment:

use strict;
use warnings;
use XML::Simple;

my $structure = { 'supertag' => [
      'value 1',
      'value 2',
      'value 3',
      'value 4',
   ],
};

my $xml = XMLout($structure, GroupTags => { supertag => 'tag'});

print "The xml generated is:\n";
print $xml;
print "\n";

It generates:

The xml generated is:
<opt>
  <supertag>
    <tag>value 1</tag>
    <tag>value 2</tag>
    <tag>value 3</tag>
    <tag>value 4</tag>
  </supertag>
</opt>
cjm
Unfortunately, that won't work, as the XML is something more akin to: `<opt><supertag><tag1>value 1</tag1><tag2>value 2</tag2><tag3>value 3</tag3><tag4>value 4</tag4></supertag></opt>`, i.e. I need different tags to appear in a specific order. Unless I'm missing something, I would essentially have to put every tag I have into the GroupTags list, which is definitely something I'd prefer to avoid.
Morinar
Well, you can't do that with GroupTags. Each supertag can have only a single tagname under it.
cjm