I need a JavaScript regex that can distinguish between PHP tags inside HTML tags and PHP tags outside of HTML tags.

e.g.

<input type="text" <? print '1'; ?> value="<? print '2'; ?>">

<? print '3';?>

So I need a regex to pull out:

<? print '1'; ?> and <? print '2'; ?>

And another regex to pull out:

<? print '3';?>

At the moment I have this regex which pulls out all PHP tags regardless of where they are:

/\n?<\?(php)?(\s|[^\s])*?\?>\n?/ig
+3  A: 

A). In a normal browser context JavaScript won't be able to "see" the PHP at all. Where did you expect the document to be read from?

B). Regex is not a suitable tool for parsing HTML, which is not a regular language. You have to use an XML/HTML parser.

annakata
+1 (and see also: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454)
Jacco
a) the text is in a string.
Adam Jimenez
b) I know it's dirty to do it this way - but it's also very quick and doesn't need to be perfect.
Adam Jimenez
-1 how is this helpful?
Adam Jimenez
@Adam Jimenez - *sigh* It's helpful because I'm telling you that your base assumptions are incorrect and that following them will inevitably lead to problems. It's really your call whether you want to go ahead hammering that screw in or not but if you don't care to listen, don't ask questions.
annakata
I listened, but it wasn't helpful. You can't use a regular XML/HTML parser on a PHP doc, as it isn't valid markup. Anyway, don't worry about it; I figured it out.
Adam Jimenez
No, you just think you have, but so it goes.
annakata
A: 

Do not use regex for HTML or XML tags. Instead, use the parsing methods from XmlDocument, XmlElement, and XmlAttribute.

Peet Brits
I'm not sure it will parse properly with PHP tags all over the place.
Adam Jimenez
A: 

This is a very complex thing to do, and I very much doubt it can be solved by regular expressions. This does depend to some degree on how complex the PHP that you want to extract is, but there are many cases to consider:

<?=max($a, $b);?>
<? echo max($a, $b); ?>
<?php echo ($a > $b) ? 'yes' : 'no'; ?>
<div><p><?php echo '</p>'; ?></div>

Why do you need to do this with JavaScript and regular expressions?

Dave
<?= and <? are deprecated.
halfdan
@halfdan That may be true in PHP.next (although http://php.net/manual/en/ini.core.php doesn't mention it), but that doesn't mean they won't turn up. I mentioned them in the interest of catering for all cases.
Dave
What about `<?php echo '?>'; ?>`? Regular expressions are not the tool for the job.
Jacco
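To make the case in the comment above concrete, here is a minimal sketch that runs the pattern from the question against that input. The variable names are only illustrative. Because the lazy quantifier stops at the first `?>` it finds, the match ends inside the string literal rather than at the real closing tag.

```javascript
// The pattern from the question, reused unchanged.
const phpTag = /\n?<\?(php)?(\s|[^\s])*?\?>\n?/ig;

// The input from the comment: a "?>" appears inside a PHP string literal.
const source = "<?php echo '?>'; ?>";

// With the g flag, match() returns an array of the full matches.
const matches = source.match(phpTag);

// Only one match is found, and it is cut short at the first "?>".
console.log(matches);
```

Whether this matters depends on how often such strings turn up in the documents being processed.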
@halfdan PHP short-tags won't be enabled by default - doesn't mean they are deprecated.
Adam Jimenez
It should be OK to match those PHP tags because they all start with <? and end with ?>. I know regex won't be perfect, so there will be the odd case that slips through the net, which I can live with. I just need it to work with my example.
Adam Jimenez
The reason I'm doing this is so I can fix this bug: http://trac.xinha.org/ticket/1391
Adam Jimenez
I award this answer - as at least you seriously considered the question. The other wise-asses just dismissed me out of hand with their holier than thou "never use regex with HTML" bs. It's very difficult and slow to write a parser that fully handles PHP and HTML when a simple regex would work fine. I think the solution will need a two-step regex - one to check for PHP tags and another to check what follows it.
Adam Jimenez
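The two-step idea in the comment above could look roughly like the sketch below. It is not the solution the asker ended up with (that was never posted); it is one possible reading, under these assumptions: the first pattern is the one from the question with the optional newlines dropped and `(\s|[^\s])` written as the equivalent `[\s\S]`, PHP blocks are blanked out before the second check so their own angle brackets are ignored, and a PHP tag counts as "inside" an HTML tag when the next angle bracket after it is a `>`. The names `PHP_TAG`, `CLOSES_HTML_TAG` and `splitPhpTags` are hypothetical.

```javascript
// Step 1: find PHP blocks (the question's pattern without the optional newlines).
const PHP_TAG = /<\?(php)?[\s\S]*?\?>/gi;
// Step 2: matches when the next angle bracket in the remaining text is a ">".
const CLOSES_HTML_TAG = /^[^<>]*>/;

function splitPhpTags(source) {
  // Blank out every PHP block with spaces of the same length, so indexes
  // stay aligned and "<" or ">" inside PHP code cannot confuse step 2.
  const masked = source.replace(PHP_TAG, (m) => " ".repeat(m.length));
  const inside = [];
  const outside = [];
  for (const match of source.matchAll(PHP_TAG)) {
    const end = match.index + match[0].length;
    // Look at what follows the PHP block in the masked text.
    const bucket = CLOSES_HTML_TAG.test(masked.slice(end)) ? inside : outside;
    bucket.push(match[0]);
  }
  return { inside, outside };
}

// The example from the question.
const sample = `<input type="text" <? print '1'; ?> value="<? print '2'; ?>">

<? print '3';?>`;

const { inside, outside } = splitPhpTags(sample);
console.log(inside);
console.log(outside);
```

This still inherits the limits discussed in the thread: a `?>` inside a PHP string ends the match early, and a stray `>` in plain text after a PHP block would be misread as the end of an HTML tag.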
Thanks! You're trying to solve a very awkward problem, and there is no easy solution.
Dave
Cheers Dave, I figured it out now.
Adam Jimenez