2010-June-11
<remove>2010-June-2</remove>
<remove>2010-June-3</remove>
2010-June-15
2010-June-16
2010-June-17
2010-June-3
2010-June-2
2010-June-1

I'm trying to find all the values that sit between the <remove> tags.

This is what I have:

$pattern = "/<remove>(.*?)<\/remove>/";
preg_match_all($pattern, $_POST['exclude'], $matches);

foreach($matches as $deselect){
    foreach ($deselect as $display){
        echo $display."<br />";
    }
}

This is what it returns:

2010-June-2
2010-June-3
2010-June-2
2010-June-3

Why is it doubling up, and how do I prevent that?

A: 

Not a regex solution, but you can remove duplicates like this:

$unique = array_unique($matches[1]); // array_unique() returns a new array and expects a flat one
Sarfraz
+2  A: 

Don't use regex to parse XML/HTML...

With that said, the output doubles up because preg_match_all() defaults to PREG_PATTERN_ORDER, so the match structure looks like:

array(
    0 => array('whole match 1', 'whole match 2', 'whole match 3'),
    1 => array('subpattern match 1', 'subpattern match 2', 'subpattern match 3'),
);
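
To make that structure concrete, here is a minimal sketch that runs the question's pattern over a few of the sample lines. The `$input` string is a stand-in I made up for `$_POST['exclude']`; everything else comes from the question.

```php
<?php
// Hypothetical stand-in for $_POST['exclude'], built from the question's sample data.
$input = "2010-June-11\n<remove>2010-June-2</remove>\n<remove>2010-June-3</remove>\n2010-June-15";

$pattern = "/<remove>(.*?)<\/remove>/";
$count = preg_match_all($pattern, $input, $matches);

// $matches[0] holds the full matches, tags included.
// $matches[1] holds only the text captured by (.*?).
print_r($matches);
```

Both sub-arrays have two entries here. When the full matches in `$matches[0]` are echoed into a page, the browser does not display the unknown `<remove>` tags themselves, so they look identical to the captured text in `$matches[1]`, which is most likely why the output appears doubled.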

So instead of your nested foreach, loop over the subpattern matches only:

if (!empty($matches[1])) {
    foreach ($matches[1] as $value) {
        echo $value;
    }
}

or pass the PREG_SET_ORDER flag to preg_match_all, which produces an array structure like:

array(
    0 => array('whole match 1', 'subpattern match 1'),
    1 => array('whole match 2', 'subpattern match 2'),
    2 => array('whole match 3', 'subpattern match 3'),
);

So then your foreach would become:

if (!empty($matches)) { 
    foreach ($matches as $match) {
        echo $match[1];
    }
}
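
For reference, a self-contained sketch of the PREG_SET_ORDER variant. The `$input` string and the `$dates` array are names I introduced for illustration; the pattern is the one from the question.

```php
<?php
// Hypothetical stand-in for $_POST['exclude'].
$input = "<remove>2010-June-2</remove>\n<remove>2010-June-3</remove>\n2010-June-15";

// With PREG_SET_ORDER, each element of $matches describes one match.
preg_match_all("/<remove>(.*?)<\/remove>/", $input, $matches, PREG_SET_ORDER);

$dates = [];
foreach ($matches as $match) {
    // $match[0] is the full match, $match[1] the captured text.
    $dates[] = $match[1];
}

echo implode("<br />", $dates);
```

With this flag `$matches` is an empty array when nothing matches, so the `!empty($matches)` guard is meaningful here.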
ircmaxell
Beat me to it, recycling my answer. But couldn't you just remove the capturing group (the parentheses) from the pattern? `$pattern = "/<remove>.*?<\/remove>/";`
Stephen P
No, because he said he wants the text between the tags, which is why you need the capturing subpattern. If he just wanted the tags and their contents, then you're right that there's no need for the subpattern (or at least that's how I understood the question)...
ircmaxell
@ircmaxell: That worked! Ta!
kylex
There is no *need* for a capturing subpattern at all.
salathe
True, you could use assertions or non-capturing subpatterns... So a better way of putting it would be that you need a subpattern (since assertions are a form of subpattern)... Fair?
ircmaxell
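
The assertion-based approach mentioned in these comments could look roughly like the sketch below. It is one possible reading of the suggestion, not code from the thread, and `$input` is again a made-up stand-in for the posted data.

```php
<?php
// Hypothetical stand-in for $_POST['exclude'].
$input = "2010-June-11\n<remove>2010-June-2</remove>\n<remove>2010-June-3</remove>";

// (?<=<remove>) is a lookbehind and (?=<\/remove>) a lookahead.
// Both must match, but neither becomes part of the matched text.
$pattern = '/(?<=<remove>).*?(?=<\/remove>)/';
preg_match_all($pattern, $input, $matches);

// There is no capturing group, so only $matches[0] exists.
foreach ($matches[0] as $value) {
    echo $value . "<br />";
}
```

Because the tags are excluded from the match itself, `$matches[0]` already contains just the text between them.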