I have a script where I need to get three parts out of a text string and return them in an array. After a couple of failed attempts, I still couldn't get it to work.

The text strings can look like this:

Some place
Some place (often text in parenthesis)
Some place (often text in parenthesis) [even text in brackets sometimes]

I need to split these strings into three:

{Some place} ({often text in parenthesis}) [{even text in brackets sometimes}]

Which should return:

1: Some place
2: often text in parenthesis
3: even text in brackets sometimes

I know this should be an easy task, but I couldn't work out the correct regular expression. This is to be used in PHP.

Thanks in advance!

+1  A: 

Split the problem into three regular expressions. After the first one, where you get every character before the first parenthesis, save your position - the same as the length of the string you just extracted.

Then in step two, do the same, but grab everything up to the closing parenthesis. (Nested parentheses make this a little more complicated but not too much.) Again, save a pointer to the end of the second string.

Getting the third string is then trivial.
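A minimal sketch of this step-by-step approach in PHP, assuming no nested parentheses or brackets. The function name `split_place_stepwise` is hypothetical; it relies on the `$offset` argument of `preg_match` together with the `\G` anchor so that each step starts where the previous one stopped.

```php
<?php
// Hypothetical helper: walks the string in three steps and remembers
// the offset reached after each part. Nested parentheses are NOT handled.
function split_place_stepwise(string $s): array
{
    $parts = [null, null, null];
    $pos = 0;

    // Step 1: everything before the first "(" or "[".
    if (preg_match('/\G[^(\[]*/', $s, $m, 0, $pos)) {
        $parts[0] = trim($m[0]);
        $pos += strlen($m[0]); // saved position = length of what was consumed
    }

    // Step 2: an optional "(...)" starting at the saved position.
    if (preg_match('/\G\s*\(([^)]*)\)/', $s, $m, 0, $pos)) {
        $parts[1] = $m[1];
        $pos += strlen($m[0]);
    }

    // Step 3: an optional "[...]" after that.
    if (preg_match('/\G\s*\[([^\]]*)\]/', $s, $m, 0, $pos)) {
        $parts[2] = $m[1];
    }

    return $parts;
}
```

Parts that are missing from the input come back as null, so the result always has three entries.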

alxp
+1  A: 

I'd probably do it as three regular expressions, starting with both parentheses and brackets, and falling back to fewer items if that fails.

^(.*?)\s+\((.*?)\)\s+\[(.*?)\]\s*$

if it fails then try:

^(.*?)\s+\((.*?)\)\s*$

if that also fails try:

^\s*(.*?)\s*$

I'm sure they can be combined into one regular expression, but I wouldn't try.
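The fallback chain could be sketched in PHP roughly as follows. The function name `split_place_fallback` is hypothetical, and the patterns here use `\s*` at the start and end so that strings without surrounding whitespace still match; a string with brackets but no parentheses is not covered by this chain.

```php
<?php
// Hypothetical helper: tries the most specific pattern first,
// then falls back to patterns that capture fewer items.
function split_place_fallback(string $s): array
{
    $patterns = [
        '/^(.*?)\s+\((.*?)\)\s+\[(.*?)\]\s*$/', // place (paren) [bracket]
        '/^(.*?)\s+\((.*?)\)\s*$/',             // place (paren)
        '/^\s*(.*?)\s*$/',                      // place only
    ];

    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $s, $m)) {
            // Drop the full match, then pad to exactly three entries.
            return array_pad(array_slice($m, 1), 3, null);
        }
    }

    return [null, null, null];
}
```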

Douglas Leeder
+2  A: 

Try something like this:

$result = preg_match('/
  ^ ([^(]+?)
  (\s* \( ([^)]++) \))?
  (\s* \[ ([^\]]++) \])?
  \s*
  $/x', $mystring, $matches);

print_r($matches);

Note that in this example, you will probably be most interested in $matches[1], $matches[3], and $matches[5].
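One thing to watch for: when an optional group at the end of the pattern does not take part in the match, `preg_match` leaves it out of `$matches` entirely, so `$matches[3]` or `$matches[5]` may not be set. A small wrapper (the name `split_place` is hypothetical) can normalise this to a fixed three-element array:

```php
<?php
// Hypothetical wrapper around the pattern above; returns exactly
// three entries, or null when the string does not fit the pattern.
function split_place(string $s): ?array
{
    $pattern = '/
      ^ ([^(]+?)
      (\s* \( ([^)]++) \))?
      (\s* \[ ([^\]]++) \])?
      \s*
      $/x';

    if (!preg_match($pattern, $s, $m)) {
        return null;
    }

    // Unmatched trailing groups are absent from $m; an unmatched group
    // in the middle shows up as an empty string. Map both to null.
    $pick = function (int $i) use ($m) {
        return (isset($m[$i]) && $m[$i] !== '') ? $m[$i] : null;
    };

    return [$m[1], $pick(3), $pick(5)];
}
```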

thomasrutter
Worked like a charm! Thanks!
rebellion
+1  A: 

Something like this?

^([^(]+?)(?: \(([^)]++)\))?(?: \[([^\]]++)\])?$
Peter Boughton