Hello,

Can anyone help me make a regex or give me a good solution that can split/check the following string:

"<2342Flsdn3Z><9124Fsflj20>"

Every token starts with a "<", its sixth character is an "F", and it ends with a ">". Is it possible to write a regex that finds strings like this?

A: 

easy:

<\d{4}F\w+>

Or, to just get the strings:

(?<=<)\d{4}F\w+(?=>)
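
In C#, the second pattern could be used roughly as follows. This is only a sketch against the sample input from the question; the variable name `inner` is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = "<2342Flsdn3Z><9124Fsflj20>";

// The lookbehind (?<=<) and lookahead (?=>) require the brackets
// but leave them out of the match itself.
var inner = new List<string>();
foreach (Match m in Regex.Matches(input, @"(?<=<)\d{4}F\w+(?=>)"))
{
    inner.Add(m.Value);
}

Console.WriteLine(string.Join(", ", inner)); // 2342Flsdn3Z, 9124Fsflj20
```

Note the verbatim string (`@"..."`): without it, `\d` and `\w` are not valid C# escape sequences.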
John Gietzen
+5  A: 

How about this: <.{4}F[^>]+>

It matches the opening <, followed by any 4 chars, F, then anything till the closing > (by matching anything that is not a >).

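using System;
using System.Text.RegularExpressions;

string input = "<2342Flsdn3Z><9124Fsflj20>";
string pattern = "<.{4}F[^>]+>";
// Print each bracketed token found in the input.
foreach (Match m in Regex.Matches(input, pattern))
{
    Console.WriteLine(m.Value);
}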

EDIT: part of making a good regex is clearly specifying the pattern you want to match. For example, the way you worded the question leaves certain details out. I responded with my pattern to match any character as long as F was where you specified.

For a better regex you could've told us a number of things:

  • Chars before F will always be digits and of length 4: \d{4} or [0-9]{4}
  • Chars after F will be of X length (6?) and can only be numbers and letters: [\dA-Z]{6}
  • Case is insensitive: use RegexOptions.IgnoreCase (.NET) or use [a-zA-Z]
  • State your intention: are you matching it? Trying to extract the inner value? What do you mean by split? Split on what?
  • Specify the language you're using: C#, Python, Perl, etc. (you did this one)
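
Putting the first three bullets together might look like the sketch below. It assumes exactly six characters after the F, which is a guess that happens to fit the sample input, and the group names `prefix` and `suffix` are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "<2342Flsdn3Z><9124Fsflj20>";

// Assumed format: 4 digits, a literal F, then 6 letters or digits.
string pattern = @"<(?<prefix>\d{4})F(?<suffix>[\dA-Z]{6})>";

// IgnoreCase lets [\dA-Z] cover lowercase letters as well.
MatchCollection matches = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);
foreach (Match m in matches)
{
    Console.WriteLine($"{m.Groups["prefix"].Value} | {m.Groups["suffix"].Value}");
}
```

Be aware that RegexOptions.IgnoreCase applies to the whole pattern, so the literal F would also match a lowercase f. If that matters, spell out [\da-zA-Z] and drop the option.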
Ahmad Mageed
This will match punctuation as well as numbers and letters -- that may be ok, but you should be aware of it.
tvanfosson
@tvanfosson agreed, I kept it open since the OP wasn't specific :)
Ahmad Mageed
A: 

I'm making some assumptions that everything inside the brackets needs to be a word character and that there is at least one, but perhaps an arbitrary number of word characters before the trailing bracket.

var regex = new Regex(@"<\w{4}F\w+>");
tvanfosson
+1  A: 

Yes. <[A-Za-z\d]{4}F[A-Za-z\d]{6}>

< followed by any 4 letters or digits, followed by F, followed by any 6 letters or digits, followed by >.

I made the assumption that it's always six characters after the F. You can modify the repetition to suit your needs.
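
For the "check" half of the question, one option is to anchor the pattern so the whole input must consist of such tokens and nothing else. A rough sketch, still assuming six characters after the F; `whole` and `IsValid` are hypothetical names:

```csharp
using System;
using System.Text.RegularExpressions;

// One or more tokens, and nothing else, between start and end of input.
var whole = new Regex(@"\A(?:<[A-Za-z\d]{4}F[A-Za-z\d]{6}>)+\z");

bool IsValid(string s) => whole.IsMatch(s);

Console.WriteLine(IsValid("<2342Flsdn3Z><9124Fsflj20>")); // True
Console.WriteLine(IsValid("<2342Flsdn3Z>junk"));          // False: trailing text
Console.WriteLine(IsValid("<2342Fls_n3Z>"));              // False: underscore is not in the class
```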

Original proposed solution, kept so that the comment below stays valid and others can learn from my mistake: [\d\w]{4}F[\d\w]{6}>

Derek Litz
\w will match either letters or digits as well as underscores.
tvanfosson
Thank you very much for the correction.
Derek Litz