I'm working on a new Java project, so I'm reading the existing code. In a very important part of the code I found the following regular expressions, and I can't really tell what they are doing. Can anybody explain in plain English what they do?

1)

 [^,]*|.+(,).+

2)

(\()?\d+(?(1)\))
+10  A: 

1)

[^,]* means any number of characters that are not a comma
.+(,).+ means 1 or more characters followed by a comma followed by 1 or more characters
|  means either the first one or the second one
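To see how the two alternatives behave together, here is a minimal sketch. It assumes the original code matches the whole input (as `String.matches()` does), which the question doesn't confirm. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

public class CommaRegexDemo {
    // Regex 1 from the question, copied verbatim.
    static final Pattern ITEM_OR_LIST = Pattern.compile("[^,]*|.+(,).+");

    // Assumption: the whole input must match, as with String.matches().
    static boolean matchesWhole(String input) {
        return ITEM_OR_LIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Samples: no comma at all, a comma in the middle,
        // and a comma at the very start or end of the string.
        String[] samples = {"", "abc", "a,b", ",a,b", ",a", "a,"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + matchesWhole(s));
        }
    }
}
```

Under that assumption, the empty string, "abc", "a,b" and ",a,b" match, while ",a" and "a," do not: the second alternative needs at least one character on each side of some comma.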

2)

(\()? means zero or one '(' (note: the backslash escapes the '(')
\d+ means 1 or more digits
(?(1)\)) means: if capture group 1 matched, then match ')' (note: no else branch is given)

Also note that parentheses are used to capture certain parts of the regular expression, except, of course, if they are escaped with a backslash.
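One caveat for Java: as far as I know, `java.util.regex` does not support the `(?(1)...)` conditional, so `Pattern.compile` should reject regex 2 with a `PatternSyntaxException`. The sketch below checks that, and shows a possible rewrite without the conditional. The rewrite is my own suggestion, not something from the original code, and it is only equivalent when the whole string has to match. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class ConditionalRegexDemo {
    // Regex 2 from the question, copied verbatim (Java string escaping added).
    static final String CONDITIONAL = "(\\()?\\d+(?(1)\\))";

    // Hypothetical rewrite: digits wrapped in parentheses, or bare digits.
    static final Pattern NUMBER = Pattern.compile("\\(\\d+\\)|\\d+");

    // Returns true if java.util.regex accepts the pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Whole-string match against the rewritten pattern.
    static boolean isNumber(String input) {
        return NUMBER.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println("conditional compiles: " + compiles(CONDITIONAL));
        for (String s : new String[] {"42", "(42)", "(42", "42)"}) {
            System.out.println("\"" + s + "\" -> " + isNumber(s));
        }
    }
}
```

With the rewrite, "42" and "(42)" match, while the unbalanced "(42" and "42)" do not, which is what the conditional is meant to enforce.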

Silmaril89
`(?(1)...)` is an if-then-else conditional, which Java doesn't support. See my answer.
polygenelubricants
A summary of the whole regular expression in English would also be useful.
earlNameless
A: 

1) Anything that doesn't starts with a comma, or anything that contains a comma in between.

2) Any number that ends with a 1, and is between parentheses, possibly closed before and opened again after the number.

eKek0
Sorry, but that's completely wrong.
Alan Moore
@AlanMoore: How is this different from polygenelubricants' conclusion, other than that this is not a tool-generated response?
eKek0
1) `[^,]*` is **not** "Anything that doesn't starts with a comma", which would be `[^,].*` (except for newline characters). Instead `[^,]*` is "anything not containing a comma". 2) As noted in other answers, the `(?(1)...)` part is a conditional expression (missing in Java), and not a grouped `1` character. See http://www.regular-expressions.info/conditional.html
Christian Semrau
@poly's answer also includes a thorough analysis of both regexes, which is very different from yours. In particular, the first regex *will* match a string that starts with a comma if there's at least one more comma. I suspect its purpose is to determine whether a string represents a single item or a comma-separated list of items.
Alan Moore
+6  A: 
polygenelubricants
Wow, didn't know about that site. Looks great. How 'complex' a regex can the program handle?
ggfan
@ggfan: it's just mechanical translation. It's probably not that hard to cover all the syntax. To a machine, regex isn't all that "confusing".
polygenelubricants
+1, learnt something new.
BalusC
@BalusC: Everyday =)
polygenelubricants
If you find yourself doing a lot of this kind of thing, you might want to invest in RegexBuddy. It will tell you how (and whether) a regex works *in your language/flavor of choice*, not just in Perl (which supports many features that Java doesn't). http://www.regexbuddy.com/
Alan Moore