hi there,

I'm trying to get the text between the heading tags using the following PHP script:

$search_string = '<h1>testing here</h1>';

$text = preg_match('<%TAG%[^>]*>(.*?)</%TAG%>',$search_string, $matches);

echo $matches[0]; 

When I run this script, no value is returned. Instead I get a warning: Warning: preg_match() [function.preg-match]: Unknown modifier '(' in C:\xampp\htdocs\check_for_files.php on line 10

Can anyone help with this please?

Thanks

+2  A: 

Your expression needs delimiters. / is the most common, but your pattern contains a / (in the closing tag) that would then have to be escaped, so # is easier here.

$text = preg_match('#<%TAG%[^>]*>(.*?)</%TAG%>#',$search_string, $matches);
Blackcoat
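
To make the delimiter choice concrete, here is a minimal sketch. It assumes the %TAG% placeholder has been replaced with the literal tag name h1 (that substitution is not shown in the question, so treat it as an assumption). With / as the delimiter the slash in the closing tag has to be escaped; with # it does not.

```php
<?php
// Assumption: %TAG% has been replaced by the literal tag name "h1".
$search_string = '<h1>testing here</h1>';

// With "/" as the delimiter, the "/" of the closing tag must be escaped.
$with_slash = preg_match('/<h1[^>]*>(.*?)<\/h1>/', $search_string, $m1);

// With "#" as the delimiter, no escaping is needed.
$with_hash = preg_match('#<h1[^>]*>(.*?)</h1>#', $search_string, $m2);

// $m2[0] is the whole match (tags included);
// $m2[1] is the first captured group, i.e. the text between the tags.
echo $m2[1], "\n";
```

Note that the question echoes $matches[0], which would include the tags themselves; $matches[1] is probably what was wanted.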
+2  A: 

The warning appears because you haven't enclosed your regex in delimiters. Try:

$text = preg_match('#<%TAG%[^>]*>(.*?)</%TAG%>#',$search_string, $matches);

Understanding the warning.

Consider your regex:

'<%TAG%[^>]*>(.*?)</%TAG%>'
 ^          ^
start      end 

Since you haven't explicitly put the regex between delimiters, PHP assumes you are using < and > as the delimiters, because < is the first character of the pattern. Hence when it sees an unescaped > it takes it as the end of the pattern. After the closing delimiter you can add modifiers, which alter the behavior of the pattern matching. Some common modifiers are:

  • i for case-insensitive matching
  • m for multi-line matching (^ and $ match at line boundaries)

Now in your case there is a ( after the closing delimiter which is not a valid modifier, hence the warning.

codaddict
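
As a rough illustration of the two modifiers listed above, the sketch below puts them after the closing # delimiter. The sample strings and variable names are made up for the example, and the tag name h1 is again assumed in place of %TAG%.

```php
<?php
// Modifiers are written after the closing delimiter.
$html = '<H1>Testing Here</H1>';

// Without "i", the lowercase pattern does not match the uppercase tag.
$plain = preg_match('#<h1[^>]*>(.*?)</h1>#', $html, $a);

// With "i", matching is case-insensitive, so <H1>...</H1> matches.
$insensitive = preg_match('#<h1[^>]*>(.*?)</h1>#i', $html, $b);

// With "m", ^ and $ match at the start and end of every line,
// so each of the two lines is matched separately.
$lines = preg_match_all('#^\w+$#m', "one\ntwo", $c);

var_dump($plain, $insensitive, $b[1], $lines);
```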
+1  A: 

/^<[^>]+>(.*)<\/[^>]+>$/ should do the trick.

mway
hi, I'm very interested in this approach. Could you please explain this? Thank you.
Pavan
It's a pretty basic expression: `<[^>]+>` means 'one or more of any character except `>`, enclosed within `<>`'; `(.*)` matches anything; and `<\/[^>]+>` is similar to the first in that it means 'one or more of any character except `>`, enclosed within `</>`'. The first and the last are structured this way so you don't have to write complex rules to match what might possibly be in the tag (attributes, etc.); we assume `>` will not be in it (because that's not valid in class names or element ids, for example). Not the most efficient expression, but it gets the job done.
mway
Also: there are parentheses around `.*` (i.e., `(.*)`) so that the group is returned as a separate entry in the results.
mway
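
For reference, a minimal sketch of how the expression from the last answer might be used with preg_match. It assumes the input string is exactly one element with nothing before or after it, since the ^ and $ anchors require that; the sample inputs are made up.

```php
<?php
$pattern = '/^<[^>]+>(.*)<\/[^>]+>$/';

// A single element, with an attribute to show that [^>]+ skips over it.
$search_string = '<h1 class="title">testing here</h1>';

if (preg_match($pattern, $search_string, $matches)) {
    // $matches[0] is the full match; $matches[1] is the parenthesised group.
    echo $matches[1], "\n";
}

// Caveat: (.*) is greedy and the closing tag is not tied to the opening one,
// so with two sibling elements the group spans everything in between.
preg_match($pattern, '<h1>a</h1><p>b</p>', $greedy);
echo $greedy[1], "\n";
```

The second call shows why this pattern is only a fit for simple, single-element input; for anything nested or repeated, a real HTML parser is likely the safer choice.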