I'm looking for a way to parse an XML/HTML document in Ruby that contains ERB-style tags <% %> with Ruby code inside. REXML, the built-in XML parser, won't allow me to do this.

I'm aware that I might be able to do it with a third-party library like Hpricot, but I'd like to avoid any external dependencies.

Is there a way to get REXML to be less strict about the tags, or to make it recognize this tag? Is there any other solution?

+4  A: 

Well, assuming you want the actual Ruby code itself, your problem is not the parser but the fact that your XML is not well-formed: a bare <% inside an element is not valid XML markup, so any conforming parser will reject it.

I'm assuming your XML looks something like this:

<parent>
    <node>
         <% some code here! %>
    </node>
</parent>

If that is indeed the case, the contents of the node node (heh) should actually be a CDATA section. So it should look like this:

<node><![CDATA[
     <% some code here! %>
]]></node>

If you do this, REXML will be able to properly parse the XML file, and return the contents of node, which will include the erb tags.
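
As a minimal sketch of what that looks like from the Ruby side (the document below is just the example above, and the variable names are made up for illustration), REXML hands back the CDATA contents as ordinary text. On Ruby 3.0 and later, REXML ships as a bundled gem rather than as part of the standard library proper, so a Bundler-managed project may need to list it explicitly.

```ruby
require "rexml/document"

# Same structure as the example above, with the ERB wrapped in CDATA.
xml = <<~XML
  <parent>
    <node><![CDATA[
      <% some code here! %>
    ]]></node>
  </parent>
XML

doc = REXML::Document.new(xml)

# Element#text returns the value of the first text child; a CDATA
# section counts as text, so the ERB tags come back untouched.
node = doc.root.elements["node"]
erb_source = node.text.strip

puts erb_source
```

The string in erb_source can then be handed to whatever needs the ERB, for example ERB.new from the standard library.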

If you do not have control over how the XML is generated, a stop-gap fix (assuming that any node containing ERB contains only ERB) is a file-wide search and replace on the opening and closing code tags, prepending and appending the CDATA markup respectively. This is easy to automate in your language of choice; there are plenty of examples here on SO.
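
A rough sketch of that search and replace in Ruby might look like the following. The helper name wrap_erb_in_cdata is hypothetical, and the sketch assumes that ERB tags only ever appear as element content (never inside attributes), that the Ruby code never contains the sequence ]]>, and that the input has not already been wrapped.

```ruby
require "rexml/document"

# Hypothetical helper: wraps every <% ... %> span in a CDATA section.
# Assumptions: ERB appears only as element content, the embedded code
# never contains "]]>", and the source is not already CDATA-wrapped.
def wrap_erb_in_cdata(source)
  # Non-greedy match, with /m so that "." also spans newlines.
  source.gsub(/<%.*?%>/m) { |erb| "<![CDATA[#{erb}]]>" }
end

raw   = "<parent><node><% some code here! %></node></parent>"
fixed = wrap_erb_in_cdata(raw)

# The rewritten string is now well-formed and REXML accepts it.
doc = REXML::Document.new(fixed)
erb_source = doc.root.elements["node"].text

puts fixed
puts erb_source
```

Running the helper over a file's contents before passing them to REXML keeps the original file untouched, which is probably what you want for a stop-gap.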

jason
thanks, I had forgotten about CDATA. I'll take it from here!
cloudhead