EDIT: Can anyone help me out with a regular expression for a string such as this:

[Header 1], [Head,er 2], Header 3

so that I can split this into chunks like:
[Header 1]
[Head,er 2]
Header 3

I have gotten as far as this:

(?<=,|^).*?(?=,|$)

Which will give me:
[Header 1]
[Head
,er 2]
Header 3

Thanks!!

+2  A: 

Variations of this question have been discussed before.

For instance:

Short answer: Regular expressions are probably not the right tool for this. Write a proper parser. An FSM implementation is easy.
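
For illustration only, here is one possible shape for such a state machine. It is a minimal sketch in Python (the thread does not name a language), and the function name `split_headers` is made up for this example. It assumes brackets are never nested and are always balanced.

```python
def split_headers(text):
    """Split on commas that are not inside [...] using a two-state machine."""
    chunks = []
    current = []
    inside = False  # state: True while we are between '[' and ']'
    for ch in text:
        if ch == "[":
            inside = True
        elif ch == "]":
            inside = False
        if ch == "," and not inside:
            # A comma outside brackets ends the current chunk.
            chunks.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    # Whatever is left after the last delimiter is the final chunk.
    chunks.append("".join(current).strip())
    return chunks


print(split_headers("[Header 1], [Head,er 2], Header 3"))
# ['[Header 1]', '[Head,er 2]', 'Header 3']
```

The only state is the `inside` flag, so extending it (for example, to reject an unclosed bracket) means adding a check rather than rewriting a pattern.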

dmckee
+1  A: 

Isn't it as simple as this?

(?<=,|^)(?:\[[^\]]*\]|[^,])*
jpalecek
+2  A: 
 (?<=,|^)\s*\[[^]]*\]\s*(?=,|$)

Use the [ and ] delimiters to your advantage.

rampion
+3  A: 
\[.*?\]

Forget the commas, you don't care about them. :)

JP Alioto
Good answer, but he changed the question on you...
dmckee
Well, now I'm confused. Does it really say Header or is that some placeholder? Are the brackets really there or optional? It has now become confusing exactly what the valid input strings are.
JP Alioto
Sorry about changing it. Valid input strings look like [Some Text], Some More Text, [Yet mo,re Text] ... and should split into [Some Text] / Some More Text / [Yet mo,re Text]
Nate
+1  A: 

You could either use a regular expression to match the values inside the brackets:

\[[^\]]*\]

Or use this regular expression to split the bracket list (using look-around assertions):

(?<=]|^)\s*,\s*(?=\[|$)
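
As a rough check of the matching approach, the sketch below runs the bracket pattern, with its character class written as `[^\]]*`, through Python's `re.findall`. Python is an assumption here, since the thread does not name a regex flavor. The second pattern, `TOKEN`, is not from this answer: it is a hypothetical extension that also picks up unbracketed values such as Header 3.

```python
import re

# Matches one bracketed value: '[', anything except ']', then ']'.
BRACKETED = re.compile(r"\[[^\]]*\]")

# Hypothetical extension: a bracketed value, or a run of non-comma
# characters that starts with a non-space character.
TOKEN = re.compile(r"\[[^\]]*\]|[^,\s][^,]*")

text = "[Header 1], [Head,er 2], Header 3"

print(BRACKETED.findall(text))
# ['[Header 1]', '[Head,er 2]']

print(TOKEN.findall(text))
# ['[Header 1]', '[Head,er 2]', 'Header 3']
```

The bracket-only pattern skips Header 3, which matters for the edited question, where unbracketed values are valid input.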
Gumbo
+1  A: 

In this case it's easier to split on the delimiters (commas) than to match the tokens (or chunks). Identifying the commas that are delimiters takes a relatively simple lookahead:

,(?=[^\]]*(?:\[|$))

Each time you find a comma, you do a lookahead for one of three things. If you find a closing square bracket first, the comma is inside a pair of brackets, so it's not a delimiter. If you find an opening bracket or the end of the line/string, it's a delimiter.
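
To see the lookahead at work, here is a minimal sketch using Python's `re.split`. Python and the helper name `split_on_delimiters` are assumptions for illustration. The pattern itself is the one given above, and the `strip()` call is added only to drop the space that follows each delimiter.

```python
import re

# A comma is a delimiter only if, scanning ahead, we reach '[' or the
# end of the string before we reach any ']'.
DELIMITER = re.compile(r",(?=[^\]]*(?:\[|$))")


def split_on_delimiters(text):
    # Split on delimiter commas, then trim surrounding whitespace.
    return [chunk.strip() for chunk in DELIMITER.split(text)]


print(split_on_delimiters("[Header 1], [Head,er 2], Header 3"))
# ['[Header 1]', '[Head,er 2]', 'Header 3']

print(split_on_delimiters("[Some Text], Some More Text, [Yet mo,re Text]"))
# ['[Some Text]', 'Some More Text', '[Yet mo,re Text]']
```

The comma inside [Head,er 2] is kept because the lookahead meets the closing bracket first and never finds an opening bracket or the end of the string.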

Alan Moore
Ah I see, I can replace the commas with another special char and split accurately using that. That'll work for me! Thanks!
Nate