I have three sentences as follows:

000000-00000 Date First text: something1 
200000-00000 Time Second text: something2
234222-34332 struc Third text: somthing3

How do I write a regex that matches the text between (Date|Time|struc) and the colon (:), without including (Date|Time|struc) itself?

A: 

The following expression captures what you want into the named group value. The capture excludes the keyword (Date, Time, or struc), the space after it, and the colon after the value.

(?:Date|Time|struc) (?<value>[^:]*)

This expression will include the colon.

(?:Date|Time|struc) (?<value>[^:]*:)
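In case it helps, here's a minimal sketch of reading the named group from C#. The class and method names (NamedGroupDemo, Extract) are made up for illustration; only the pattern comes from above.

```csharp
using System;
using System.Text.RegularExpressions;

static class NamedGroupDemo
{
    // First pattern from this answer: the keyword sits in a non-capturing
    // group, and the text up to the colon lands in the named group "value".
    static readonly Regex Pattern =
        new Regex(@"(?:Date|Time|struc) (?<value>[^:]*)");

    // Hypothetical helper: returns the captured text, or null if no match.
    public static string Extract(string line)
    {
        Match match = Pattern.Match(line);
        return match.Success ? match.Groups["value"].Value : null;
    }

    static void Main()
    {
        Console.WriteLine(Extract("000000-00000 Date First text: something1"));
        Console.WriteLine(Extract("200000-00000 Time Second text: something2"));
        Console.WriteLine(Extract("234222-34332 struc Third text: somthing3"));
    }
}
```

For the three sample lines this should print First text, Second text and Third text. Swapping in the second pattern would leave the trailing colon on each value.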
Daniel Brückner
+2  A: 

I suspect this is what you're after. The regex part is:

new Regex(@"^\d{6}-\d{5} \w* ([^:]*): ")

And here's a short but complete test program:

using System;
using System.Text.RegularExpressions;

class Test
{   
    static void Main(string[] args)
    {
        Parse("000000-00000 Date First text: something1");
        Parse("200000-00000 Time Second text: something2");
        Parse("234222-34332 struc Third text: somthing3");
    }

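    // Six digits, a hyphen, five digits, a keyword made of word characters,
    // then group 1: everything up to (but not including) the colon.
    static readonly Regex Pattern = new Regex
        (@"^\d{6}-\d{5} \w* ([^:]*): ");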

    static void Parse(string text)
    {
        Console.WriteLine("Input: {0}", text);
        Match match = Pattern.Match(text);
        if (!match.Success)
        {
            Console.WriteLine("No match!");
        }
        else
        {
            Console.WriteLine("Middle bit: {0}", match.Groups[1]);
        }
    }
}

Note that this doesn't assume "Date", "Time" and "struc" are the only possible values after the digits, just that they'll be made up of word characters. It also assumes you want to match against the whole line, not just the middle part. It's easy to extract the other sections with other groups if that would be helpful to you.
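To illustrate that last point, here's a rough sketch that pulls out every section with named groups. The group names (id, kind, middle, rest) and the SectionsDemo/Split names are my own invention for this example, and it assumes the text after the colon runs to the end of the line.

```csharp
using System;
using System.Text.RegularExpressions;

static class SectionsDemo
{
    // Same shape as the pattern above, with each section in its own
    // named group. The group names are hypothetical.
    static readonly Regex Pattern = new Regex(
        @"^(?<id>\d{6}-\d{5}) (?<kind>\w*) (?<middle>[^:]*): (?<rest>.*)$");

    // Returns the four sections in order, or null if the line doesn't match.
    public static string[] Split(string line)
    {
        Match m = Pattern.Match(line);
        if (!m.Success)
        {
            return null;
        }
        return new[]
        {
            m.Groups["id"].Value,
            m.Groups["kind"].Value,
            m.Groups["middle"].Value,
            m.Groups["rest"].Value
        };
    }

    static void Main()
    {
        string[] parts = Split("234222-34332 struc Third text: somthing3");
        Console.WriteLine(string.Join(" | ", parts));
        // prints: 234222-34332 | struc | Third text | somthing3
    }
}
```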

Jon Skeet
Admit it. You had that answer prepared. :-)
Tomalak
I have all the answers prepared. It takes a long time to find the right one though ;)
Jon Skeet
A: 

This:

^\d{6}-\d{5} \S+ ([^:]+)

Would match "First text", "Second text" and "Third text", without explicitly referring to (Date|Time|struc). The match is in group 1.

Tomalak
A: 

If, from your example, you're expecting the output to be:

First text Second text Third text

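You would use the following regular expression, with case-insensitive matching enabled because the keywords are written in upper case here: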

(?<=(DATE|TIME|STRUC)\s)[^:]*

Looking at your example, though, I can't imagine that would be very useful. It looks like the descriptive text comes after the colon, which would imply that you really want everything to the end of the line. That would be:

(?i:(?<=(DATE|TIME|STRUC)\s).*)

[Checked using RegexBuddy - so if I interpreted your question correctly, this works]
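If you want to try both from C#, something along these lines should do it. This is only a sketch: the class, field and method names are made up, and I'm passing RegexOptions.IgnoreCase for the first pattern because it has no inline (?i:...) of its own.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookbehindDemo
{
    // The lookbehind requires the keyword plus one whitespace character
    // just before the match, without making them part of the match.
    public static readonly Regex BeforeColon = new Regex(
        @"(?<=(DATE|TIME|STRUC)\s)[^:]*", RegexOptions.IgnoreCase);

    // Second pattern: case-insensitivity is switched on inline.
    public static readonly Regex ToEndOfLine = new Regex(
        @"(?i:(?<=(DATE|TIME|STRUC)\s).*)");

    // Hypothetical helper: the matched text, or null if there is no match.
    public static string FirstMatch(Regex regex, string line)
    {
        Match match = regex.Match(line);
        return match.Success ? match.Value : null;
    }

    static void Main()
    {
        string line = "200000-00000 Time Second text: something2";
        Console.WriteLine(FirstMatch(BeforeColon, line));
        // prints: Second text
        Console.WriteLine(FirstMatch(ToEndOfLine, line));
        // prints: Second text: something2
    }
}
```

Here the result is the match itself (match.Value) rather than a capture group, which is the main practical difference from the group-based answers.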

BenAlabaster