Duplicate

http://stackoverflow.com/questions/585853/regex-for-variable-declaration-and-initialization-in-c

I was looking for a Regular Expression to parse CSV values, and I came across this Regular Expression

[^,]+

This does the job by splitting the words on every occurrence of a ",". What I want to know is this: say I have the string

value_name v1,v2,v3,v4,...

Now I want a regular expression that finds the words v1, v2, v3, v4, ...

I tried ->

^value_name\s+([^,]+)*

But it didn't work for me. Can you tell me what I am doing wrong? I remember working on regular expressions and their state-machine implementation. Doesn't it work the same way?

If a string starts with value_name followed by one or more whitespace characters, go to the next state. In that state, read a word until a "," comes. Then do it again! And each word will be grouped!

Am I wrong in understanding it?

+3  A: 

I would expect it to capture only v1 in the group, because the first comma is "blocking" it from grabbing the rest of the fields: nothing in the pattern consumes the comma. How you handle this is going to depend on the methods you use on the regular expression, but it may make sense to make two passes, first grabbing all the fields separated by commas and then breaking things up on spaces. Perhaps try ^value_name\s+(?:([^,]+),?)* instead.
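To make the "blocking" concrete, here is a minimal sketch in JavaScript (the sample string and variable names are assumptions for illustration, not from the question). It also shows a caveat with the suggested pattern: in JavaScript a repeated capturing group keeps only its last iteration, so the group ends up holding v4 rather than every field. In .NET, each iteration is additionally recorded in the group's Captures collection.

```javascript
// Sample input modelled on the question; the names here are illustrative.
const input = "value_name v1,v2,v3,v4";

// Pattern from the question: [^,]+ stops at the first comma, and the
// repetition cannot continue because nothing in the pattern consumes ",".
const original = /^value_name\s+([^,]+)*/.exec(input);
console.log(original[0]); // "value_name v1"
console.log(original[1]); // "v1"

// Suggested pattern: the optional comma lets the repetition reach the end,
// but JavaScript keeps only the last iteration of a repeated group.
const repeated = /^value_name\s+(?:([^,]+),?)*/.exec(input);
console.log(repeated[0]); // "value_name v1,v2,v3,v4"
console.log(repeated[1]); // "v4"
```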

Logan Capaldo
A: 

This is a homework question, isn't it? You almost got me!

Paul Keister
+2  A: 

Oh yeah, lists....

/(?:^value_name\s+|,\s*)([^,]+)/g will theoretically grab them, but you will have to use RegExp.exec() in a loop to get the capture rather than the whole match.

I wish pre-matches worked in JS :(.

Otherwise, go with Logan's idea: /^value_name\s+([^,]+(?:,\s*[^,]+)*)$/ followed by .split(/,\s*/);
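Both approaches can be sketched roughly as follows. The helper names fieldsByExec and fieldsBySplit and the sample input are hypothetical, introduced only for illustration.

```javascript
// Hypothetical helper: collect fields with the global regex and exec() in a loop.
function fieldsByExec(line) {
  const re = /(?:^value_name\s+|,\s*)([^,]+)/g;
  const fields = [];
  let m;
  // With the g flag, each exec() call resumes from re.lastIndex.
  while ((m = re.exec(line)) !== null) {
    fields.push(m[1]); // capture group 1, not the whole match m[0]
  }
  return fields;
}

// Hypothetical helper: validate the whole line first, then split the captured list.
function fieldsBySplit(line) {
  const m = /^value_name\s+([^,]+(?:,\s*[^,]+)*)$/.exec(line);
  return m === null ? [] : m[1].split(/,\s*/);
}

console.log(fieldsByExec("value_name v1, v2,v3"));  // [ 'v1', 'v2', 'v3' ]
console.log(fieldsBySplit("value_name v1, v2,v3")); // [ 'v1', 'v2', 'v3' ]
```

One difference worth noting: the loop version does not reject a line that lacks the value_name prefix (it simply picks up whatever follows each comma), whereas the anchored version returns nothing for such a line.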

Simon Buchan
+6  A: 

You could use a Regex similar to those proposed:

(?:^value_name\s+)?([^,]+)(?:\s*,\s*)?
  • The first group is non-capturing and would match the start of the line and the value_name.
    To ensure that the Regex is still valid over all matches, we make that group optional by using the '?' modifier (meaning match at most once).

  • The second group is capturing and would match your vXX data.

  • The third group is non-capturing and would match the ',' and any whitespace before and after it.
    Again, we make it optional by using the '?' modifier, otherwise the last 'vXX' group would not match unless we ended the string with a final ','.

In your trials, the Regex wouldn't match multiple times. Remember that if you want a Regex to match multiple occurrences in a string, the whole Regex needs to match every single occurrence, so you have to build it to match not only the 'value_name' at the start of the string but also every occurrence of 'vXX' in it.
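As a quick illustration of "one whole-Regex match per occurrence", here is a sketch of the same pattern in JavaScript using String.prototype.matchAll. The helper name fieldsByMatchAll and the sample input are assumptions for illustration. Because [^,]+ is greedy, whitespace in front of a comma ends up inside the captured value, so this sketch trims each field.

```javascript
// Hypothetical helper: apply the pattern once per field.
function fieldsByMatchAll(line) {
  // matchAll requires the g flag; it yields one match object per occurrence.
  const re = /(?:^value_name\s+)?([^,]+)(?:\s*,\s*)?/g;
  // m[1] is the capturing group; trim() drops whitespace that the greedy
  // [^,]+ picked up in front of a comma (for example "v2 ").
  return Array.from(line.matchAll(re), (m) => m[1].trim());
}

console.log(fieldsByMatchAll("value_name v1, v2 ,v3")); // [ 'v1', 'v2', 'v3' ]
```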

In C#, you could list all matches and groups using code like this:

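using System.Text.RegularExpressions;

// subjectString holds the input line, e.g. "value_name v1,v2,v3,v4"
Regex r = new Regex(@"(?:^value_name\s+)?([^,]+)(?:\s*,\s*)?");
Match m = r.Match(subjectString);
while (m.Success) {
    // Groups[0] is the whole match, so start at 1 to visit only the capturing groups
    for (int i = 1; i < m.Groups.Count; i++) {
        Group g = m.Groups[i];
        if (g.Success) {
            // matched text: g.Value
            // match start: g.Index
            // match length: g.Length
        }
    }
    // Advance to the next occurrence in the input
    m = m.NextMatch();
}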
Renaud Bompuis
C#'s Regex looks pretty well thought out...
Simon Buchan
A: 

Thanks very much, it really helped me out!

humail