demo:

$str = 'bcs >Hello >If see below!';
$repstr = preg_replace('/>[A-Z0-9].*?see below[^,\.<]*/','',$str);
echo $repstr;

I want this small program to output "bcs >Hello ", but it actually outputs only "bcs ".

What's wrong with my pattern?

A: 

Why don't you write it like this:

$str = 'bcs >Hello >If see below!';
$repstr = preg_replace('/>If see below[^,\.<]*/','',$str);
echo $repstr;
Peter Stuifzand
Because what I want is to start from the first capitalized character or number after >.
Shore
A: 

This might be a good alternative to what you have. The problem with your regexp is that instead of selecting what you want, you are selecting what you don't want and replacing it with an empty string. The best approach, in my opinion, is to select what you want, and that is what the code below does. You end up with whatever the first sub-pattern matched; if the pattern does not match at all, you get your string back unchanged.

$str = 'bcs >Hello >If see below!';
$repstr = preg_replace('/^([\w]+ >[\w]+).*?see below.*?$/i', '$1', $str);
var_dump($repstr);
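As a rough check of the two outcomes described above, the sketch below runs the same pattern on the question's string and on a string that cannot match. The second input string and the variable names are made up for illustration. Note that the captured text ends right after "Hello", so the trailing space the question asked for is not kept.

```php
<?php
// Pattern copied from the answer above; $nonMatching is an illustrative input.
$pattern = '/^([\w]+ >[\w]+).*?see below.*?$/i';

// Case 1: the pattern matches, so the whole string is replaced by group 1.
$matching = 'bcs >Hello >If see below!';
$kept = preg_replace($pattern, '$1', $matching);
var_dump($kept); // string(10) "bcs >Hello"

// Case 2: the pattern does not match, so preg_replace returns the input as-is.
$nonMatching = 'nothing to replace here';
$unchanged = preg_replace($pattern, '$1', $nonMatching);
var_dump($unchanged === $nonMatching); // bool(true)
```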

I hope this helps.

partoa
Sorry, what I want to do is exactly this: replace everything that starts at the "first capitalized character or number after >" and ends with "see below[^,\.<]*" with an empty string.
Shore
A: 

I think the problem is that you're misinterpreting how a non-greedy quantifier acts. Once it's in operation, yes, it stops earlier than a greedy one would. But it has no say in where the match starts: the engine still begins at the leftmost position where a match is possible, and the quantifier isn't aware of what comes before it. It's only concerned with its current position. Hence, the regular expression you posted will match all of:

">Hello >If see below!"

Let's see how this works:

/>[A-Z0-9].*?see below[^,\.<]*/

The regex first looks for ">" in "bcs >Hello >If see below!", and finds the first one, which is the one right before "Hello". Ok, let's check the next part of the expression:

[A-Z0-9]

The next char is an H, which matches the pattern [A-Z0-9]. Still good! Next:

.*?

Now we match non-newline chars, as few as possible, until the rest of the expression, "see below[^,\.<]*", can match. If we had used a plain greedy quantifier, it would have run on through multiple occurrences of "see below" and stopped at the last possible one. (So if your string had continued on and contained other text matching that pattern, it would have captured that as well.) The non-greedy quantifier doesn't mean that your whole pattern will return the smallest of all possible matches in the string. It just dictates how that particular quantifier behaves.
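The walkthrough above can be reproduced with preg_match. This is only a sketch: the first call uses the pattern and string from the question, while the longer `$long` string with two "see below" markers is a made-up input for comparing the lazy and greedy quantifiers.

```php
<?php
$str = 'bcs >Hello >If see below!';

// The engine commits to the leftmost '>' that can start a match;
// the lazy .*? only keeps the END of the match as early as possible.
preg_match('/>[A-Z0-9].*?see below[^,\.<]*/', $str, $m);
$lazyMatch = $m[0];
echo $lazyMatch, "\n"; // >Hello >If see below!

// Hypothetical longer input with two "see below" markers.
$long = 'bcs >Hello >If see below, and >More see below.';
preg_match('/>[A-Z0-9].*?see below/', $long, $lazy);
preg_match('/>[A-Z0-9].*see below/', $long, $greedy);
echo $lazy[0], "\n";   // >Hello >If see below
echo $greedy[0], "\n"; // >Hello >If see below, and >More see below
```

Both matches start at the same ">" before "Hello"; only the end position differs.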

You might want to try the following pattern then:

/>[A-Z0-9][^>]*?see below[^,\.<]*/
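A minimal check of this pattern against the string from the question, assuming nothing beyond preg_replace itself. Because [^>] cannot step over the second ">", the attempt that starts before "Hello" fails and the match begins at ">If" instead.

```php
<?php
// Pattern from this answer, string from the question.
$pattern = '/>[A-Z0-9][^>]*?see below[^,\.<]*/';
$str = 'bcs >Hello >If see below!';

$result = preg_replace($pattern, '', $str);
var_dump($result); // string(11) "bcs >Hello "
```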

Hopefully this clears it up!

patjbs
Thank you for your reply, but this won't work for me, because it fails in this case: $str = 'bcs <>Hello <>If <br> see below!'; I want to get 'bcs <>Hello <' after processing.
Shore
You might try elaborating on the context of your question more; then you might get some better answers.
patjbs
Here is the solution:

$str = 'bcs <>Hello <>If <br> <br> see below!';
$repstr = preg_replace('/>[A-Z0-9][^>]*(>[^A-Z0-9]*)*see below[^,\.<]*/','',$str);
echo $repstr;

Thank you for your attention to this problem :)
Shore
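As a quick sanity check, the sketch below runs the pattern from the last comment on both inputs that appear in this thread. The variable names are illustrative only. The extra group (>[^A-Z0-9]*)* lets the match cross a ">" as long as no uppercase letter or digit follows it before "see below", which is what allows it to step over the <br> tags.

```php
<?php
// Pattern copied from the comment above.
$pattern = '/>[A-Z0-9][^>]*(>[^A-Z0-9]*)*see below[^,\.<]*/';

$plain  = 'bcs >Hello >If see below!';             // string from the question
$markup = 'bcs <>Hello <>If <br> <br> see below!'; // string from the comment

$plainResult  = preg_replace($pattern, '', $plain);
$markupResult = preg_replace($pattern, '', $markup);

var_dump($plainResult);  // string(11) "bcs >Hello "
var_dump($markupResult); // string(13) "bcs <>Hello <"
```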