Hi,

I'm trying to write a regex that detects recursive template calls in an XSL stylesheet.

So far I haven't had much success.

In the following code, I need to detect that template B is called recursively:

<xsl:template name="A"> 
     blah blha ?!@#?%$#^%?*?&(({}:"><;'[]\/.,./'
    <xsl:call-template name="B">  
    blah blah      
</xsl:template>    
<xsl:template name="B"> 
   blah blha 
    <xsl:call-template name="B">                
    blah blah  
</xsl:template>
<xsl:template name="C"> 
     blah blha 
    <xsl:call-template name="B">  
    blah blah  
</xsl:template>

In this specific case, the regex works.

But if I remove the second call to B (the one inside template B), the regex matches the last call to B, the one inside template C. That shouldn't happen.

(<xsl:template name=\"(?<templateName>\w+)\">.*?(?<=<xsl:call-template name=\"\k<templateName>\">).*?</xsl:template>)+

I'm no regex guru. Any help is welcome.

Thank you.

+8  A: 

Reiterating the obvious point: Don't try parsing XML or other non-regular languages with regular expressions. Please.

Use an XML parser and take a look at the resulting tree. You can create a digraph of the template calls and look for cycles in that. This should be a much more robust solution than trying to hack it together with regular expressions. That way you can also detect that template A calls template B, which calls template C, which calls template A again. Such indirect recursion would be invisible to your current approach (if it could be persuaded to work).
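As a rough illustration of the digraph idea, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It assumes the stylesheet is well-formed XML (the snippet in the question is not, since its call-template elements are never closed) and that only named templates invoked through xsl:call-template matter. The function names build_call_graph and find_recursive_templates and the SAMPLE string are made up for this example.

```python
import xml.etree.ElementTree as ET

# Standard XSLT namespace URI; ElementTree expands prefixed tags to {uri}local.
XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def build_call_graph(xslt_text):
    """Map each named template to the set of template names it calls."""
    root = ET.fromstring(xslt_text)
    graph = {}
    for template in root.iter(f"{{{XSL_NS}}}template"):
        name = template.get("name")
        if name is None:
            continue  # skip match-only templates (no name attribute)
        callees = graph.setdefault(name, set())
        for call in template.iter(f"{{{XSL_NS}}}call-template"):
            callee = call.get("name")
            if callee:
                callees.add(callee)
    return graph


def find_recursive_templates(graph):
    """Return the names of templates that can reach themselves via calls."""
    recursive = set()
    for start in graph:
        stack = list(graph[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node == start:
                recursive.add(start)  # found a path back to the start
                break
            if node in seen:
                continue
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return recursive


# Hypothetical, well-formed version of the stylesheet from the question.
SAMPLE = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template name="A"><xsl:call-template name="B"/></xsl:template>
  <xsl:template name="B"><xsl:call-template name="B"/></xsl:template>
  <xsl:template name="C"><xsl:call-template name="B"/></xsl:template>
</xsl:stylesheet>"""

if __name__ == "__main__":
    print(sorted(find_recursive_templates(build_call_graph(SAMPLE))))
```

On the sample above this reports only B. Because the search follows calls transitively, an A -> B -> C -> A chain would flag all three templates. Templates applied through xsl:apply-templates are not covered by this sketch.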

Joey
I agree, regexes are not the perfect tool for that. But I'm writing a small tool that will run once, maybe twice. I want to create something fast, and I can live with the fact that some calls like the ones you describe won't be detected. I don't want to be involved in that project for long. Can you suggest an XML parser that you think would fit? If it helps: I was planning on running this regex on multiple files with a script in PowerShell or IronPython. Thanks for the feedback.
Jean-Francois
Well, nearly every language has facilities for dealing with XML, and simply walking the nodes probably doesn't require much effort. But I don't usually do this, so I'm not really proficient in dealing with XML.
Joey
@Jean-Francois: PowerShell has extremely powerful XML support built in. See http://powershell.com/cs/blogs/ebook/archive/2009/03/30/chapter-14-xml.aspx
TrueWill
Can't stress this enough.
a paid nerd