I'm trying to output an XML file, blogs.xml, as YAML so I can drop it into vision.app, a tool for designing Shopify e-commerce sites locally.

Shopify's YAML looks like this:

- id: 2
  handle: bigcheese-blog
  title: Bigcheese blog
  url: /blogs/bigcheese-blog
  articles:
    - id: 1
      title: 'One thing you probably did not know yet...'
      author: Justin
      content: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
      created_at: 2005-04-04 16:00
      comments:
        - 
          id: 1
          author: John Smith
          email: [email protected]
          content: Wow...great article man.
          status: published
          created_at: 2009-01-01 12:00
          updated_at: 2009-02-01 12:00
          url: ""
        - 
          id: 2
          author: John Jones
          email: [email protected]
          content: I really enjoyed this article. And I love your shop! It's awesome. Shopify rocks!
          status: published
          created_at: 2009-03-01 12:00
          updated_at: 2009-02-01 12:00
          url: "http://somesite.com/"
    - id: 2
      title: Fascinating
      author: Tobi
      content: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
      created_at: 2005-04-06 12:00
      comments:
  articles_count: 2
  comments_enabled?: true 
  comment_post_url: ""
  comments_count: 2
  moderated?: true

However, a sample of my XML looks like this:

       <article>
          <author>Rouska Mellor</author>
          <blog-id type="integer">273932</blog-id>
          <body>Worn Again are hiring for a new Sales Director.

      To view the full job description and details of how to apply click &quot;here&quot;:http://antiapathy.org/?page_id=83</body>
          <body-html>&lt;p&gt;Worn Again are hiring for a new Sales Director.&lt;/p&gt;
      &lt;p&gt;To view the full job description and details of how to apply click &lt;a href=&quot;http://antiapathy.org/?page_id=83&quot;&gt;here&lt;/a&gt;&lt;/p&gt;</body-html>
          <created-at type="datetime">2009-07-29T13:58:59+01:00</created-at>
          <id type="integer">1179072</id>
          <published-at type="datetime">2009-07-29T13:58:59+01:00</published-at>
          <title>Worn Again are hiring!</title>
          <updated-at type="datetime">2009-07-29T13:59:40+01:00</updated-at>
        </article>
        <article>

I naively assumed converting from one serialised data format to another was fairly straightforward, and I could simply do this:

>> require 'hpricot'
=> true
>> b = Hpricot.XML(open('blogs.xml'))
>> puts b.to_yaml

But I'm getting this error.

NoMethodError: undefined method `yaml_tag_subclasses?' for Hpricot::Doc:Class
    from /usr/local/lib/ruby/1.8/yaml/tag.rb:69:in `taguri'
    from /usr/local/lib/ruby/1.8/yaml/rubytypes.rb:16:in `to_yaml'
    from /usr/local/lib/ruby/1.8/yaml.rb:391:in `call'
    from /usr/local/lib/ruby/1.8/yaml.rb:391:in `emit'
    from /usr/local/lib/ruby/1.8/yaml.rb:391:in `quick_emit'
    from /usr/local/lib/ruby/1.8/yaml/rubytypes.rb:15:in `to_yaml'
    from /usr/local/lib/ruby/1.8/yaml.rb:117:in `dump'
    from /usr/local/lib/ruby/1.8/yaml.rb:432:in `y'
    from (irb):6
    from :0
>>

How can I get the data output in the form outlined at the top of this question? I've tried requiring the 'yaml' library explicitly, thinking that I'm missing some of those methods, but that hasn't helped either.

+1  A: 

Hi! I've found this. Maybe it could help. http://brains.parslow.net/node/1623

dierre
Unfortunately not :( Sorry!
Josh
+1  A: 

Sorry, Josh, I think what you've found here is a limitation in the hpricot and/or the yaml libraries, pure and simple. I'm not sure hpricot's ever supported yaml in this way! The method in question is dynamically added by the yaml library to the Object class, as well as other fundamental Ruby types, but doesn't show up in Hpricot::Doc's definition for some reason, even though Hpricot::Doc does seem to inherit indirectly from Object.

I can say that I've reproduced it as well, so it's not just you.

You can very easily add the missing method:

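require 'hpricot'
require 'yaml'

class Hpricot::Doc
    # YAML's taguri lookup calls this class method; return a real boolean.
    def self.yaml_tag_subclasses?
        true
    end
end
b = Hpricot.XML(open('blogs.xml'))
puts b.to_yaml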

but you'll find that doesn't get you much further. Here's what I get:

--- !ruby/object:Hpricot::Doc 
options: 
  :xml: true

So we're not iterating over the container like we should.

At this point, to get YAML support using the yaml library, the brute-force way (maybe the only way) would be to add to_yaml methods to Hpricot's classes, to teach them how to output YAML correctly. Take a look at "/usr/lib/ruby/1.8/yaml/rubytypes.rb" (on a Mac, that'd be something like "/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/yaml/rubytypes.rb") for examples of how that's done for each of the fundamental Ruby types. The classes you might need to add this to are defined on the C side: see "hpricot/ext/hpricot_scan/hpricot_scan.rl", in the method Init_hpricot_scan.
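
A lighter alternative to patching Hpricot's classes might be to copy the data you care about into plain Arrays and Hashes first, since the yaml library already knows how to emit those. Below is a rough sketch of that idea, not a tested Shopify importer. It assumes REXML from the standard library instead of Hpricot (so it runs without extra gems), and the helper names article_to_hash and blog_to_yaml are made up for illustration. It only handles the flat article fields shown in the question.

```ruby
require 'rexml/document'
require 'yaml'

# Convert one <article> element into a plain Hash with string keys.
# Dashes in tag names become underscores (created-at -> created_at),
# and values marked type="integer" are converted to Integer.
# (Hypothetical helper, not part of Hpricot or REXML.)
def article_to_hash(article)
  article.elements.each_with_object({}) do |child, hash|
    key = child.name.tr('-', '_')
    text = child.text.to_s
    hash[key] = child.attributes['type'] == 'integer' ? text.to_i : text
  end
end

# Build the blog structure from ordinary Arrays and Hashes, which the
# yaml library can serialise without any extra to_yaml methods.
# (Hypothetical helper; only fills in articles and articles_count.)
def blog_to_yaml(xml)
  doc = REXML::Document.new(xml)
  articles = REXML::XPath.match(doc, '//article').map { |a| article_to_hash(a) }
  [{ 'articles' => articles, 'articles_count' => articles.size }].to_yaml
end
```

Something like puts blog_to_yaml(File.read('blogs.xml')) would then print the result. The remaining keys in Shopify's format (handle, url, comments and so on) and the renaming of body to content would still need to be filled in by hand, and the date strings would need reformatting if vision.app is strict about them.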

Owen S.