views:

93

answers:

2

Hey all,

I'm trying to use named groups to parse a string.

An example input is:

exitcode: 0; session id is RDP-Tcp#2

and my attempted regex is:

("(exitCode)+(\s)*:(\s)*(?<exitCode>[^;]+)(\s)*;(\s)*(session id is)(\s)*(?<sessionID>[^;]*)(\s)*");

Where is my syntax wrong?

Thanks

+2  A: 

In your example:

exitcode: 0; session id is RDP-Tcp#2

It does not end with a semi-colon, but it seems your regular expression expects a semi-colon to mark the end of sessionID:

(?<sessionID>[^;]*)

I notice that immediately following both your named groups, you have optional whitespace matches -- perhaps it would help to add whitespace into the character classes, like this:

(?<exitCode>[^;\s]+)
(?<sessionID>[^;\s]*)
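
To make that concrete, here is a minimal sketch in Python of the pattern with whitespace added to both negated classes. A few things here are my assumptions rather than anything from the question: Python itself (it spells named groups as (?P<name>...) instead of (?<name>...)), the function name parse, and case-insensitive matching, which I turned on because the sample input says "exitcode" while the pattern says "exitCode".

```python
import re

# Python writes named groups as (?P<name>...); the (?<name>...) form in the
# question belongs to other regex flavours.
PATTERN = re.compile(
    r"exitcode\s*:\s*(?P<exitCode>[^;\s]+)\s*;\s*"
    r"session id is\s*(?P<sessionID>[^;\s]*)",
    re.IGNORECASE,  # assumption: the sample input is lowercase "exitcode"
)

def parse(line):
    """Return (exit code, session ID) as strings, or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return m.group("exitCode"), m.group("sessionID")

result = parse("exitcode: 0; session id is RDP-Tcp#2")
```

Because neither class can swallow whitespace any more, the optional whitespace around each group no longer competes with the group itself.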

Even better, split the string on the semi-colon first, and then perhaps you don't even need a regular expression. You'd have these two substrings after you split on the semi-colon, and the exitcode and sessionID happen to be on the ends of the strings, making it easy to parse them any number of ways:

exitcode: 0
session id is RDP-Tcp#2
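
As a rough sketch of the no-regex route, something like the following Python would do. The language and the function name parse_without_regex are my own choices for illustration, and it assumes the exit code follows a colon and the session ID is the last whitespace-separated token:

```python
def parse_without_regex(line):
    """Split on the first semi-colon and read the value off the end of each half."""
    left, sep, right = line.partition(";")
    if not sep:
        return None  # no semi-colon, so not the expected shape
    # Exit code: the text after the last colon in the left half
    # (assumes a colon is present).
    exit_code = left.rpartition(":")[2].strip()
    # Session ID: the last whitespace-separated token in the right half.
    tokens = right.split()
    session_id = tokens[-1] if tokens else ""
    return exit_code, session_id

result = parse_without_regex("exitcode: 0; session id is RDP-Tcp#2")
```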
Richard Walters
A: 

Richard's answer really covers it already: either remove the semicolon at the end or make it optional and it should work, and definitely consider putting whitespace in the negated classes or just splitting on the semi-colon. Still, here's a little extra food for thought. :)


Don't bother with \s where it's not necessary. The text you're parsing looks like some form of log output, so it should be fairly predictable, and if so something simpler will do:

exitcode: (?<exitCode>\d+);\s+session id is\s+(?<sessionID>[^;\s]*);?


For splitting on the semi-colon, you'll get an array of two strings. Here's some pseudo-code, assuming the exit code is numeric and the session ID doesn't contain spaces:

splitresult = input.split('\s*;\s*')
exitCode  = splitresult[0].match('\d+')
sessionId = splitresult[1].match('\S*$')

Depending on who will be maintaining the code, this might be considered more readable than the above expression.
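
For anyone who wants to run the pseudo-code, here is one possible translation into Python. Treat it as a sketch: Python and the function name parse_by_splitting are my assumptions, and it leans on the same two assumptions as above (numeric exit code, no spaces in the session ID).

```python
import re

def parse_by_splitting(line):
    """Split on the semi-colon, then pull one value out of each piece."""
    parts = re.split(r"\s*;\s*", line)
    if len(parts) < 2:
        return None  # no semi-colon in the input
    exit_match = re.search(r"\d+", parts[0])
    if exit_match is None:
        return None  # no digits where the exit code should be
    # \S*$ always matches (possibly the empty string), so no None check is needed.
    session_match = re.search(r"\S*$", parts[1])
    return int(exit_match.group()), session_match.group()

result = parse_by_splitting("exitcode: 0; session id is RDP-Tcp#2")
```

Converting the exit code to an int is optional; drop the int() call if you'd rather keep it as a string.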

Peter Boughton