I wrote the following regular expression for a project I am working on. It works fine, but out of professional curiosity I was wondering if it can be compressed or shortened:

  /[(]PRD[)].+;.+;.*;.+;.+;.*;.*;.*;/

Regexes have always seemed like voodoo to me...

+3  A: 

For starters, the single-character classes can be replaced with escaped characters:

/\(PRD\).+;.+;.*;.+;.+;.*;.*;.*;/

Next, you can group the related items together:

/\(PRD\)(.+;){2}.*;(.+;){2}(.*;){3}/

This actually makes it textually longer, though.
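
A quick way to convince yourself that the grouped form follows the same rules is to run both patterns over a few inputs. This is only a sketch in Python; the sample strings are made up for illustration and do not come from the question.

```python
import re

# Original pattern from the question and the grouped rewrite.
original = re.compile(r"[(]PRD[)].+;.+;.*;.+;.+;.*;.*;.*;")
grouped = re.compile(r"\(PRD\)(.+;){2}.*;(.+;){2}(.*;){3}")

# Hypothetical sample inputs, not taken from the question.
samples = [
    "(PRD)a;b;;c;d;;;;",      # optional fields left empty
    "(PRD)a;b;c;d;e;f;g;h;",  # every field filled
    "(PRD)a;b;c;",            # too few fields
    "PRD a;b;;c;d;;;;",       # missing the literal "(PRD)"
]

def agree(text):
    """True when both patterns give the same match/no-match verdict."""
    return bool(original.search(text)) == bool(grouped.search(text))

results = [bool(grouped.search(s)) for s in samples]
for s, hit in zip(samples, results):
    print(f"{s!r}: match={hit}, same verdict as original={agree(s)}")
```

Agreement on a handful of strings is not a proof, but the two patterns expand to the same sequence of fields, so they should never disagree.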

John Feminella
Yes, it's longer, but 50% more awesome. ;P
Ash
And since the '.' should probably be '[^;]' each time, the rewrite then becomes a whole lot shorter - like 15 characters shorter than the amended original.
Jonathan Leffler
There could also be an argument that the '(...)' sequences should be '(?:...)' (non-capturing groupings) since the original did not contain any captures. There's another argument that the original probably wants the groupings, but then there'd be a question of whether it is better to split on ';'.
Jonathan Leffler
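
To illustrate the two suggestions in the comments above, here is a rough Python sketch. It assumes that fields never contain a ';' themselves, which the question does not actually state, and the input string is invented for the demonstration.

```python
import re

# '.' also matches ';', so a field can swallow separators.
loose = re.compile(r"\(PRD\)(.+;){2}.*;(.+;){2}(.*;){3}")

# Assumption: fields never contain ';', so '[^;]' can stand in for '.'.
# '(?:...)' groups without capturing, like the original, which had no groups.
strict = re.compile(r"\(PRD\)(?:[^;]+;){2}[^;]*;(?:[^;]+;){2}(?:[^;]*;){3}")

# Hypothetical input: the first (required) field is empty, but there is
# one extra separator, so '.+' can match ";a" as if it were a field.
text = "(PRD);a;b;;c;d;;;;"

loose_hit = bool(loose.search(text))
strict_hit = bool(strict.search(text))

print("loose:", loose_hit, "capturing groups:", loose.groups)
print("strict:", strict_hit, "capturing groups:", strict.groups)
```

With '[^;]' each field stops at its own separator, so the empty first field is rejected instead of being papered over.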
+2  A: 
/\(PRD\).+;.+;.*;.+;.+;(.*;){3}/

I don't think you will gain much while keeping exactly the same rules. If you don't mind making all the text between the ";" separators optional, you could use:

/\(PRD\)(.*;){8}/
eglasius
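
To see what changes when every field becomes optional, compare the two patterns from this answer on a record with only separators. A small Python sketch; the inputs are hypothetical.

```python
import re

# Same rules as the question: fields 1, 2, 4 and 5 must be non-empty.
strict = re.compile(r"\(PRD\).+;.+;.*;.+;.+;(.*;){3}")

# Relaxed rules: eight ';'-terminated fields, any of which may be empty.
relaxed = re.compile(r"\(PRD\)(.*;){8}")

# Hypothetical inputs for illustration.
all_empty = "(PRD);;;;;;;;"        # eight separators, no field content
typical = "(PRD)a;b;;c;d;;;;"      # required fields filled in

verdicts = {
    "all_empty": (bool(strict.search(all_empty)), bool(relaxed.search(all_empty))),
    "typical": (bool(strict.search(typical)), bool(relaxed.search(typical))),
}
for name, (s, r) in verdicts.items():
    print(f"{name}: strict={s}, relaxed={r}")
```

The relaxed pattern accepts everything the strict one does, plus records whose required fields are empty, so it is only a fair trade if that case cannot occur or does not matter.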
+3  A: 
/\(PRD\)(.+;.+;.*;){2}(.*;){2}/

is shorter than

/\(PRD\)((.+;){2}.*;){2}(.*;){2}/

but arguably less awesome. Both are successfully shorter than

/[(]PRD[)].+;.+;.*;.+;.+;.*;.*;.*;/

though (if only because of the character class change).

Or you could even go with

/\(PRD\)(.+;.+;.*;){2}.*;.*;/

which may be the shortest you can get with the same rules.
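
For anyone who wants to check the character counts rather than eyeball them, a short Python sketch follows. The dictionary keys are just labels made up here; the patterns are the ones quoted in this thread, delimiters included.

```python
import re

# Patterns as written in the thread, including the '/' delimiters.
candidates = {
    "original": r"/[(]PRD[)].+;.+;.*;.+;.+;.*;.*;.*;/",
    "escaped": r"/\(PRD\).+;.+;.*;.+;.+;.*;.*;.*;/",
    "grouped": r"/\(PRD\)(.+;){2}.*;(.+;){2}(.*;){3}/",
    "tail_grouped": r"/\(PRD\).+;.+;.*;.+;.+;(.*;){3}/",
    "nested": r"/\(PRD\)((.+;){2}.*;){2}(.*;){2}/",
    "block_twice": r"/\(PRD\)(.+;.+;.*;){2}(.*;){2}/",
    "block_twice_plain_tail": r"/\(PRD\)(.+;.+;.*;){2}.*;.*;/",
}

lengths = {name: len(p) for name, p in candidates.items()}
shortest = min(lengths, key=lengths.get)

# Sanity check on one hypothetical input: all variants should match it.
sample = "(PRD)a;b;;c;d;;;;"
all_match = all(re.search(p.strip("/"), sample) for p in candidates.values())

for name, n in sorted(lengths.items(), key=lambda item: item[1]):
    print(f"{n:2d}  {name}")
print("shortest:", shortest, "| all match sample:", all_match)
```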

Cebjyre
+1 shortest with the same rules - now, if you bend the rules a bit ... :)
eglasius