views:

118

answers:

6

I have user inputs such as these:

paul vs Team Apple Orange
Team Apple Orange vs paul
Team Apple Orange v.s. paul

I need to write a regular expression that detects the words on both sides of the separator (vs, vs., v.s.) and stores the side with the keyword "team" in the variable team and the other side in name. For example:

name = "paul"
team = "Apple Orange"
+5  A: 

Try this really crude program:

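using System;
using System.Text.RegularExpressions;

string[] tests = new string[] {
  "paul vs Team Apple Orange",
  "Team Apple Orange vs paul",
  "Team Apple Orange v.s. paul"
};

// Optional "Team " prefix, lazy left side, separator ("vs" or "v.s."),
// optional "Team " prefix, then the right side.
string pattern = "(?:Team )?(.*?)\\s+(?:vs|v\\.s\\.)\\s+(?:Team )?(.*)";
Regex regex = new Regex(pattern); // built once and reused for every line

foreach (string line in tests)
{
  Match match = regex.Match(line);
  Console.WriteLine(line);
  if (match.Success)
  {
    // Group 1 is the left side, group 2 the right side, both without "Team ".
    string team1 = match.Groups[1].Value;
    string team2 = match.Groups[2].Value;
    Console.WriteLine("Team 1 : " + team1);
    Console.WriteLine("Team 2 : " + team2);
  }
  else
  {
    Console.WriteLine("No match found");
  }
  Console.WriteLine();
}
Console.ReadLine(); // keep the console window open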

Output:

paul vs Team Apple Orange
Team 1 : paul
Team 2 : Apple Orange

Team Apple Orange vs paul
Team 1 : Apple Orange
Team 2 : paul

Team Apple Orange v.s. paul
Team 1 : Apple Orange
Team 2 : paul

Edit: if you want "vs." and "v.s" to match as well, just change the expression to:

string pattern = "(?:Team )?(.*?)\\s+(?:v\\.?s\\.?)\\s+(?:Team )?(.*)";

The first version only matches the separators "vs" and "v.s.".
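
To make the difference between the two separator sub-patterns concrete, here is a minimal sketch that tests each one in isolation against the four spellings. The `^` and `$` anchors and the variable names `strict` and `loose` are added here for illustration only and are not part of the pattern above; the snippet assumes C# 9 or later so that top-level statements compile.

```csharp
using System;
using System.Text.RegularExpressions;

// Separator from the first version: only "vs" or "v.s." (anchors added for this test).
string strict = @"^(?:vs|v\.s\.)$";
// Separator from the edited version: each dot is optional.
string loose = @"^v\.?s\.?$";

foreach (string sep in new[] { "vs", "vs.", "v.s", "v.s." })
{
    bool s = Regex.IsMatch(sep, strict);
    bool l = Regex.IsMatch(sep, loose);
    Console.WriteLine($"{sep}: strict={s}, loose={l}");
}
// vs: strict=True, loose=True
// vs.: strict=False, loose=True
// v.s: strict=False, loose=True
// v.s.: strict=True, loose=True
```

Note that the looser form accepts everything the stricter one does, so an alternation such as `(?:vs|v\.?s\.?)` adds nothing over `v\.?s\.?` alone.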

cletus
Is there any benefit to doing `(?:vs|v\.?s\.?)` over just `v\.?s\.?`?
Peter Boughton
It depends on how strict you want or need to be. Do you want to match "vs." and "v.s"?
cletus
Yes, both.
newbie
A: 

What have you tried so far? Are you doing this within (for example) a Perl script?

azp74
+3  A: 

This sounds like a two-step procedure: first extract the left and right sides, then test them to determine which side contains the "team" keyword.

The regex would be something like this:

Regex.Match(input, @"(.+)\s+v\.?s\.?\s+(.+)", RegexOptions.IgnoreCase)

The left and right sides will be in groups 1 and 2 of the regex match.
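
The two steps could be sketched roughly as follows. `TrySplitMatchup` is a hypothetical helper name, and the choices to anchor the pattern, make the left group lazy, and reject inputs where both or neither side starts with "team" are assumptions of this sketch rather than part of the answer. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: step 1 splits on the separator, step 2 finds the "team" side.
static bool TrySplitMatchup(string input, out string name, out string team)
{
    name = team = string.Empty;

    // Step 1: capture the left and right sides around vs / vs. / v.s / v.s.
    Match m = Regex.Match(input.Trim(), @"^(.+?)\s+v\.?s\.?\s+(.+)$", RegexOptions.IgnoreCase);
    if (!m.Success) return false;

    string left = m.Groups[1].Value;
    string right = m.Groups[2].Value;

    // Step 2: check which side begins with the "team" keyword.
    Regex teamPrefix = new Regex(@"^team\s+", RegexOptions.IgnoreCase);
    bool leftIsTeam = teamPrefix.IsMatch(left);
    bool rightIsTeam = teamPrefix.IsMatch(right);

    // Assumption: exactly one side is a team; anything else is rejected.
    if (leftIsTeam == rightIsTeam) return false;

    team = teamPrefix.Replace(leftIsTeam ? left : right, "");
    name = leftIsTeam ? right : left;
    return true;
}

foreach (string line in new[] { "paul vs Team Apple Orange", "Team Apple Orange v.s. paul" })
{
    if (TrySplitMatchup(line, out string name, out string team))
        Console.WriteLine($"name = {name}, team = {team}");
}
// name = paul, team = Apple Orange
// name = paul, team = Apple Orange
```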

Jordan Liggitt
+2  A: 

Based on your examples, this works:

(?<Team>Team[\w\s]+)\s(?:vs|v\.s\.|vs\.)\s(?<Name>[\w]+)|(?<Name>[\w]+)\s(?:vs|v\.s\.|vs\.)\s(?<Team>Team[\w\s]+)

Edit: My example only allows alphanumeric characters, so it all depends on what you need.

J.13.L
A: 

cletus' answer is correct, but you can't tell which group is the name and which is the team. With the simpler

/(.+)\s+(?:vs|v|v\.s\.)\s+(.+)/

you can inspect $1 and $2 for "Team" and strip it off to get the team name. Or use

/(?:(team\s+)?(.+))\s+(?:vs|v|v\.s\.)\s+(?:(team\s+)?(.+))/i

The i flag makes the match case-insensitive, so "Team" is recognized as well. Then, if $1 is defined (the left side starts with "team"), $2 is the team and $4 is the name. If $1 is undefined, $2 is the name and, provided $3 is defined, $4 is the team.

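This is JavaScript, not C#, but it demonstrates the idea:

var m = "team paul vs apples oranges".match(/(?:(team\s+)?(.+))\s+(?:vs|v|v\.s\.)\s+(?:(team\s+)?(.+))/i);
// Print every captured group by index; an unmatched optional group prints as undefined.
for (var i = 0; i < m.length; i++) {
  console.log(i + ": " + m[i]);
}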
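
For readers working in C#, a rough port of the same idea might look like the sketch below. It reuses the four-group pattern with `RegexOptions.IgnoreCase` and relies on `Group.Success` as the .NET counterpart of checking whether `$1` or `$3` is undefined. `Describe` and its return strings are hypothetical, chosen only for this illustration, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: reports which side is the name and which is the team.
static string Describe(string input)
{
    Match m = Regex.Match(
        input,
        @"(?:(team\s+)?(.+))\s+(?:vs|v|v\.s\.)\s+(?:(team\s+)?(.+))",
        RegexOptions.IgnoreCase);
    if (!m.Success) return "no match";

    // An optional group that did not take part in the match has Success == false.
    bool leftIsTeam = m.Groups[1].Success;
    bool rightIsTeam = m.Groups[3].Success;
    if (leftIsTeam == rightIsTeam) return "ambiguous";

    string team = leftIsTeam ? m.Groups[2].Value : m.Groups[4].Value;
    string name = leftIsTeam ? m.Groups[4].Value : m.Groups[2].Value;
    return $"name={name}; team={team}";
}

Console.WriteLine(Describe("paul vs Team Apple Orange"));   // name=paul; team=Apple Orange
Console.WriteLine(Describe("Team Apple Orange v.s. paul")); // name=paul; team=Apple Orange
```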
Ed
A: 

This code distinguishes between the team and the name, allowing you to simply pick them out of the regular expression match information.

Regex test = new Regex(@"(?i)^(?:(?:Team\s+(?<team>.*?))|(?<name>.*?))(?:\s+(?<vs>v\.?s\.?)\s+)(?:(?:Team\s+(?<team>.*?))|(?<name>.*?))$");
foreach (string input in ...)
{
  Match match = test.Match(input);
  if (match.Success) 
  {
    string team = match.Groups["team"].Value;
    string name = match.Groups["name"].Value;
  }
}
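
As a quick check of the pattern above, the following sketch runs it over the question's sample inputs plus one "vs." variant. It depends on .NET allowing the same group name (`team`, `name`) to appear more than once in a pattern. The `inputs` array and the output format are assumptions made for this illustration, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

Regex test = new Regex(@"(?i)^(?:(?:Team\s+(?<team>.*?))|(?<name>.*?))(?:\s+(?<vs>v\.?s\.?)\s+)(?:(?:Team\s+(?<team>.*?))|(?<name>.*?))$");

// Sample inputs from the question, plus a "vs." variant (added for illustration).
string[] inputs =
{
    "paul vs Team Apple Orange",
    "Team Apple Orange vs paul",
    "Team Apple Orange v.s. paul",
    "paul vs. Team Apple Orange"
};

foreach (string input in inputs)
{
    Match match = test.Match(input);
    if (!match.Success) continue;

    // Whichever side began with "Team" fills the team group; the other fills name.
    string team = match.Groups["team"].Value;
    string name = match.Groups["name"].Value;
    Console.WriteLine($"name = {name}, team = {team}");
}
// Each of the four inputs prints: name = paul, team = Apple Orange
```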
John Fisher