I would like to match these lines:

ParameterINeed: 758
ParameterCount: 8695
ParameterText: 56

From each line I would like to get the parameter name and the parameter value. Could you please tell me how to write the Regex.Matches pattern for this, and how to put the results into a Dictionary?

I use this code:

string Text = "ParameterINeed: 758\r\nParameterCount: 8695\r\nParameterText: 56";
string Pattern = "^(\\w+):\\s+(\\d+)$";
MatchCollection ma = Regex.Matches(Text, Pattern, RegexOptions.Singleline);

But I get ma.Count = 0.

A: 

It depends on some other constraints, but you could just use this.

// Named groups use the (?<name>...) syntax; options are passed to the constructor.
var regex = new Regex(@"^(?<parameter>\w+):\s(?<value>\d+)$", RegexOptions.Multiline);
var dictionary = new Dictionary<string, string>();

foreach (Match match in regex.Matches(inputData))
{
  dictionary.Add(
    match.Groups["parameter"].Value,
    match.Groups["value"].Value);
}
bendewey
It's not working. Dictionary size is 0
tomaszs
I can't test it at the moment, try playing with the RegexOptions
bendewey
Please provide a comment when down-voting.
bendewey
+3  A: 

Try this regex

"^Parameter(\w+):\s+(\d+)$"

You can then access the name via match.Groups[1] and the value via match.Groups[2]. My answer is based on the idea that for the string ParameterINeed: 42 you want

  • Name: INeed
  • Value: 42

If instead you wanted ParameterINeed for the name, you could just remove the Parameter word from the regex.

"^(\w+):\s+(\d+)$"
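A minimal sketch of reading the two capture groups, assuming a single input line with no trailing carriage return. The names GroupAccessSketch and ParseLine are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupAccessSketch
{
    // Returns (name, value) for a line such as "ParameterINeed: 42",
    // or null when the line does not match. Hypothetical helper.
    public static Tuple<string, string> ParseLine(string line)
    {
        Match m = Regex.Match(line, @"^Parameter(\w+):\s+(\d+)$");
        if (!m.Success)
            return null;

        // Groups[0] is the whole match; Groups[1] and Groups[2] are the captures.
        return Tuple.Create(m.Groups[1].Value, m.Groups[2].Value);
    }

    static void Main()
    {
        var pair = ParseLine("ParameterINeed: 42");
        Console.WriteLine("Name: {0}, Value: {1}", pair.Item1, pair.Item2);
    }
}
```

Group numbering starts at 1 because index 0 always holds the full match.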

EDIT Responding to added code sample

Try the following sample instead

string Text = "ParameterINeed: 758\r\nParameterCount: 8695\r\nParameterText: 56";
string[] lines = Text.Split('\n');
string Pattern = @"^(\w+):\s+(\\d+)$";
foreach ( string line in lines ) {
  MatchCollection ma = Regex.Matches(line, Pattern, RegexOptions.Singleline);
}
JaredPar
When I do it this way it matches only last parameter in text.
tomaszs
Oh, and is there a way to do this without foreach loop?
tomaszs
@tomaszs, generally speaking it's harder to get a regex to play around with a new line. You can use one but it makes it harder to reason about the values within the match collection. Is there a reason you don't want to process it in a loop?
JaredPar
@JaredPar - I would like to get rid of any kind of splits and loops so that it will be pure, independent Regex. And with this split it's like building a metal bridge with one part of it made of rope.
tomaszs
Only if the Regex is the rope.
Joel Coehoorn
get rid of the splits and use RegexOptions.Multiline
Cipher
Also, you have an extra backslash in front of the \d. It should be @"^(\w+):\s+(\d+)$";
Cipher
A: 

Why don't you simply split the text into lines, then split each line at ':' and trim the results? Or is it a more complex issue?
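For comparison, the split-and-trim idea could look roughly like this. It is only a sketch: the names SplitSketch and Parse are hypothetical, and it assumes every useful line contains a ':'.

```csharp
using System;
using System.Collections.Generic;

static class SplitSketch
{
    // Hypothetical helper: builds a name -> value map without any regex.
    public static Dictionary<string, string> Parse(string text)
    {
        var result = new Dictionary<string, string>();

        // Splitting on both '\r' and '\n' and dropping empty entries
        // handles "\r\n" and "\n" line endings alike.
        string[] lines = text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            // Split at the first ':' only, so a value may itself contain ':'.
            string[] parts = line.Split(new[] { ':' }, 2);
            if (parts.Length == 2)
                result[parts[0].Trim()] = parts[1].Trim();
        }
        return result;
    }

    static void Main()
    {
        var values = Parse("ParameterINeed: 758\r\nParameterCount: 8695\r\nParameterText: 56");
        foreach (KeyValuePair<string, string> item in values)
            Console.WriteLine("{0} = {1}", item.Key, item.Value);
    }
}
```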

Miha Markic
I would like to do this with Regex for future changes
tomaszs
I see. Then it has to be regex.
Miha Markic
A: 

Not having used C# I can't give a direct code sample.

If it follows normal regex conventions, though, the problem might be the ^ and $ anchors. By default they match the start of the string (^) and the end of the string ($), not necessarily the start and end of each line.

What if you try something like (tested with perl):

/(\w+):\s(\w+)(?:\r\n)?/g
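In C#, the anchor behaviour described above can be checked with a short sketch like the following. The names AnchorSketch and CountMatches are made up, and the sample text deliberately uses plain \n line breaks so that only the anchors are being tested.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorSketch
{
    // Hypothetical helper: counts anchored "name: value" matches under the given options.
    public static int CountMatches(string text, RegexOptions options)
    {
        return Regex.Matches(text, @"^(\w+):\s(\w+)$", options).Count;
    }

    static void Main()
    {
        string text = "a: 1\nb: 2\nc: 3";

        // Default: ^ and $ anchor to the whole string, so no single line can match.
        Console.WriteLine(CountMatches(text, RegexOptions.None));      // prints 0

        // Multiline: ^ and $ anchor to each line.
        Console.WriteLine(CountMatches(text, RegexOptions.Multiline)); // prints 3
    }
}
```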

shelfoo
A: 

Here is a tested solution. ;)

static void Main(string[] args)
{
 try
 {

  string sInput;

  // The string to search.
  sInput = "ParameterINeed: 758\r\nParameterCount: 8695\r\nParameterText: 56";


  var regex = new Regex(@"(?<parameter>\w+):\s+(?<value>\d+)");
  var dictionary = new Dictionary<string, string>();

  foreach (Match match in regex.Matches(sInput))
  {
   dictionary.Add(
   match.Groups["parameter"].Value, 
   match.Groups["value"].Value); 
  }

  foreach (KeyValuePair<string, string> item in dictionary)
   Console.WriteLine("key: {0}; value:{1}", item.Key, item.Value);

 }
 finally
 {
  Console.ReadKey();
 }

}
AnthonyWJones
+2  A: 

RegexOptions.Singleline only affects how the period (.) works, not how ^ and $ work. You need RegexOptions.Multiline for that.

Multiline mode doesn't treat \r\n as a line break; it only considers the \n character a line break. You have to consume the \r character to reach the line break.

string text = "ParameterINeed: 758\r\nParameterCount: 8695\r\nParameterText: 56";
string pattern = @"^(\w+):\s+(\d+)\r?$";
MatchCollection ma = Regex.Matches(text, pattern, RegexOptions.Multiline);

Now ma.Count is 3.

This is how you put the matches in a dictionary:

Dictionary<string, int> values = new Dictionary<string, int>();
foreach (Match match in ma) {
 values.Add(match.Groups[1].Value, int.Parse(match.Groups[2].Value));
}
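For anyone who wants to avoid the explicit foreach, the same pattern can probably be combined with LINQ. This is a sketch under assumptions: the names LinqSketch and Parse are hypothetical, and every parameter name is assumed to be unique, because ToDictionary throws on duplicate keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class LinqSketch
{
    // Hypothetical helper: one expression from input text to dictionary.
    public static Dictionary<string, int> Parse(string text)
    {
        return Regex.Matches(text, @"^(\w+):\s+(\d+)\r?$", RegexOptions.Multiline)
            .Cast<Match>() // MatchCollection is non-generic on older frameworks
            .ToDictionary(m => m.Groups[1].Value,
                          m => int.Parse(m.Groups[2].Value));
    }

    static void Main()
    {
        var values = Parse("ParameterINeed: 758\r\nParameterCount: 8695\r\nParameterText: 56");
        Console.WriteLine(values.Count); // prints 3
    }
}
```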
Guffa
Singleline causes $ to match the end of the string, whereas Multiline causes $ to match the first new line.
AnthonyWJones
@AnthonyWJones: No, Singleline doesn't change how $ works at all; it only changes how the period works. Multiline causes $ to match at the end of each line, not the new line character itself.
Guffa
This is the correct answer.
Cipher