I have a text file like so

{word} definition
{another word} another definition {word} they don't have to be on different lines

The regex I was using was

    $regex = '/\{([a-z]+?)\}(.+?)\{/i';

This, however, causes a problem: the match consumes the trailing {, so the next {word} has lost its opening brace and is never matched.

To demonstrate, I did this for debugging:

    echo preg_replace($regex, '<b style="background-color: red;">$1</b><b style="background-color: yellow;">$2</b>', $content);

Here is an example of my output. Notice that the opening brace of the next word is gone, so that word is not matched by the regex:

    <b style="background-color: red;">shrub</b><b style="background-color: yellow;"> Multi stemmed woody plant</b>Abaxial}    side or face away from the axis

How can I amend my regex to get this to work? Thank you

EDIT

Many thanks for your answers. I've changed my regex like so

    $regex = '/\{([a-z\-\s]+?)\}([^\{]+)/i';

I'll also look into the lookahead articles.
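
As a quick sanity check, here is a minimal sketch that runs the edited pattern over the sample text from the top of the question. The `<b>`/`<i>` markup and the variable names are just illustrative assumptions, not what I use in the real script.

```php
<?php
// Sample text from the question; double quotes so that \n is a real newline.
$content = "{word} definition\n{another word} another definition {word} they don't have to be on different lines";

// Edited pattern: \s and \- allow multi-word or hyphenated terms,
// and [^\{]+ stops before the next opening brace without consuming it.
$regex = '/\{([a-z\-\s]+?)\}([^\{]+)/i';

// Illustrative markup only: term in <b>, definition in <i>.
$html = preg_replace($regex, '<b>$1</b><i>$2</i>', $content);

echo $html, "\n";
```

All three terms get wrapped, including the multi-word one and the one that shares a line with the previous definition.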

+2  A: 

You'll want to use a lookahead, which checks that a character follows without consuming it.

You could restructure your regex like so:

    $regex = '/\{([a-z]+?)\}(.+?)(?=\{)/i';

Kibbee
+7  A: 

For this specific case, you could do:

    $regex = '/\{([a-z]+?)\}([^\{]+)/i';

[^\{] means "match any character that is not a left brace". This also has the advantage of not requiring a { at the end of your input.

More generally, you can also use lookahead assertions as others have mentioned.
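
For what it's worth, here is a rough sketch of how the pattern above could be used with `preg_match_all` to collect the words and definitions into an array. The sample input and the `$entries` name are made up for illustration.

```php
<?php
// Illustrative input in the same shape as the question's file.
$content = "{word} definition\n{another} another definition {third} same line";

// [^\{]+ matches everything up to (but not including) the next opening brace.
$regex = '/\{([a-z]+?)\}([^\{]+)/i';

preg_match_all($regex, $content, $matches, PREG_SET_ORDER);

$entries = [];
foreach ($matches as $m) {
    // $m[1] is the word, $m[2] the definition; trim surrounding whitespace and newlines.
    $entries[$m[1]] = trim($m[2]);
}

print_r($entries);
```

Because the negated class also matches newlines, a definition runs until the next `{` or the end of the input, whichever comes first. Note that a repeated word would overwrite the earlier entry in this sketch.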

Miles
+1  A: 

Use a positive lookahead assertion.
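
A minimal sketch of the difference, comparing the consuming pattern from the question with a lookahead version. The `|\z` branch (so the last entry matches at the end of the input) and the `s` flag (so `.` can cross newlines) are my own additions and assumptions about what you'd want; the sample text is adapted from the question's output.

```php
<?php
// Illustrative input: two entries on one line.
$content = "{shrub} Multi stemmed woody plant {abaxial} side or face away from the axis";

// Pattern from the question: the trailing \{ is part of the match and gets consumed.
$consuming = '/\{([a-z]+?)\}(.+?)\{/i';

// Lookahead version: (?=\{|\z) only asserts that a brace or the end of
// the input follows; it does not consume anything.
$lookahead = '/\{([a-z]+?)\}(.+?)(?=\{|\z)/is';

$consumingCount = preg_match_all($consuming, $content, $a);
$lookaheadCount = preg_match_all($lookahead, $content, $b);

echo "consuming: $consumingCount match(es)\n";
echo "lookahead: $lookaheadCount match(es)\n";
print_r($b[1]); // the captured words
```

The consuming pattern finds only the first entry because it eats the brace that the second entry needs, while the lookahead version leaves that brace in place for the next match.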

JP Alioto
+2  A: 

You could change the last part to match only non-curly brace characters instead of .+ followed by a curly brace, like so:

    $regex = '/\{([a-z]+?)\}([^{]+)/i';

John Kugelman