




I have user inputs such as these:

paul vs Team Apple Orange
Team Apple Orange vs paul
Team Apple Orange v.s. paul

I need to write a regular expression that captures the words on both sides of the separator (vs, vs., v.s.), storing the side that carries the keyword "team" in the variable team and the other side in name.

name = "paul"
team = "Apple Orange"
+5  A: 

Try this really crude program:

string[] tests = new string[] {
  "paul vs Team Apple Orange",
  "Team Apple Orange vs paul",
  "Team Apple Orange v.s. paul"
};

// Requires: using System; using System.Text.RegularExpressions;
foreach (string line in tests) {
  string pattern = "(?:Team )?(.*?)\\s+(?:vs|v\\.s\\.)\\s+(?:Team )?(.*)";
  Regex regex = new Regex(pattern);
  Match match = regex.Match(line);
  Console.WriteLine(line);
  if (match.Success) {
    // Group 1 is the left side, group 2 the right side, each without a leading "Team "
    string team1 = match.Groups[1].Value;
    string team2 = match.Groups[2].Value;
    Console.WriteLine("Team 1 : " + team1);
    Console.WriteLine("Team 2 : " + team2);
  } else {
    Console.WriteLine("No match found");
  }
  Console.WriteLine();
}

Output:

paul vs Team Apple Orange
Team 1 : paul
Team 2 : Apple Orange

Team Apple Orange vs paul
Team 1 : Apple Orange
Team 2 : paul

Team Apple Orange v.s. paul
Team 1 : Apple Orange
Team 2 : paul

Edit: if you want to allow "vs." and "v.s" to correctly match just change the expression to:

string pattern = "(?:Team )?(.*?)\\s+(?:v\\.?s\\.?)\\s+(?:Team )?(.*)";

The first version will only correctly match on "vs" or "v.s.".
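
One way to see the difference between the two separator sub-patterns is to test each against all four spellings. The sketch below is JavaScript rather than C#; the `separators` list and the `accepted` helper are made up for illustration, and the sub-patterns are anchored with `^` and `$` so that the whole separator has to match.

```javascript
// Separator sub-patterns from this answer, anchored for a whole-string test.
const strict = /^(?:vs|v\.s\.)$/; // first version
const loose = /^v\.?s\.?$/;       // edited version

// Hypothetical list of spellings to try.
const separators = ["vs", "vs.", "v.s", "v.s."];

// Hypothetical helper: returns the spellings that a pattern accepts.
function accepted(pattern) {
  return separators.filter((s) => pattern.test(s));
}

console.log(accepted(strict).join(", ")); // vs, v.s.
console.log(accepted(loose).join(", "));  // vs, vs., v.s, v.s.
```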

Is there any benefit to doing `(?:vs|v\.?s\.?)` over just `v\.?s\.?` ?
Peter Boughton
It depends on how strict you want or need to be. Do you want to match "vs." and "v.s"?
Yes, both.

What have you tried so far? Are you doing this within (for example) a Perl script?

+3  A: 

This sounds like a two step procedure... first extract the left and right sides, then test them to determine which side contains the "team" keyword.

The regex would be something like this:

Regex.Match(input, @"(.+)\s+v\.?s\.?\s+(.+)", RegexOptions.IgnoreCase)

The left and right sides will be in groups 1 and 2 of the regex match.
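
The two steps might look roughly like this. It is a sketch in JavaScript rather than C#; `parseMatchup` is a hypothetical name, and the assumption that the keyword appears as a leading "team " prefix (in any letter case) comes from the sample inputs in the question.

```javascript
// Step 1: split the input around the separator (vs, vs., v.s, v.s.).
// Step 2: test each side for a leading "team" keyword.
// parseMatchup is a hypothetical helper name, not part of the answer.
function parseMatchup(input) {
  const m = input.match(/(.+)\s+v\.?s\.?\s+(.+)/i);
  if (!m) return null; // no separator found

  const teamPrefix = /^team\s+/i; // assumed form of the keyword
  const left = m[1].trim();
  const right = m[2].trim();

  if (teamPrefix.test(left)) {
    return { team: left.replace(teamPrefix, ""), name: right };
  }
  if (teamPrefix.test(right)) {
    return { team: right.replace(teamPrefix, ""), name: left };
  }
  return null; // neither side carries the keyword
}

console.log(parseMatchup("paul vs Team Apple Orange"));
console.log(parseMatchup("Team Apple Orange v.s. paul"));
```

Keeping the keyword check out of the regex leaves the pattern short, at the cost of a few extra lines of ordinary code.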

Jordan Liggitt
+2  A: 

Based on your examples, this works:


Edit: My example only allows alphanumeric characters, so it all depends on what you need.


cletus' answer is correct, but you can't tell which group is the name and which group is the team. Using the simpler


then you can inspect $1 and $2 for "Team", and strip it off to get the team name. Or use


then if $1 == "Team", $2 is the team and $4 is the name. If $1 is undefined, $2 is the name and, provided $3 == "Team", $4 is the team.

This is JavaScript, not C#, but it demonstrates the idea:

  var m = "team paul vs apples oranges".match(/(?:(team\s+)?(.+))\s+(?:vs|v|v\.s\.)\s+(?:(team\s+)?(.+))/);
  for (var i in m) {
    console.log(i + ": " + m[i]);
  }
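
The group inspection described above can be wrapped in a small function. This is only a sketch: `splitSides` is a hypothetical name, and the `i` flag, the `^`/`$` anchors, the lazy first `.+?`, and the `v\.?s\.?` separator are assumptions added here, not part of the pattern in this answer.

```javascript
// Four capture groups: an optional "team " prefix and the remaining text, on each side.
// Flags, anchors, and separator spelling are assumptions made for this sketch.
const fourGroups = /^(team\s+)?(.+?)\s+v\.?s\.?\s+(team\s+)?(.+)$/i;

// splitSides is a hypothetical helper name.
function splitSides(input) {
  const m = input.match(fourGroups);
  if (!m) return null;                                        // no separator found
  if (m[1] !== undefined) return { team: m[2], name: m[4] };  // keyword on the left
  if (m[3] !== undefined) return { team: m[4], name: m[2] };  // keyword on the right
  return null;                                                // keyword on neither side
}

console.log(splitSides("Team Apple Orange vs paul"));
console.log(splitSides("paul v.s. Team Apple Orange"));
```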

This code distinguishes between the team and the name, so you can simply read each one from the named groups of the regular expression match.

Regex test = new Regex(@"(?i)^(?:(?:Team\s+(?<team>.*?))|(?<name>.*?))(?:\s+(?<vs>v\.?s\.?)\s+)(?:(?:Team\s+(?<team>.*?))|(?<name>.*?))$");
foreach (string input in ...) {
  Match match = test.Match(input);
  if (match.Success) {
    // The named groups are filled by whichever side has (or lacks) the "Team" prefix
    string team = match.Groups["team"].Value;
    string name = match.Groups["name"].Value;
  }
}
John Fisher