I know that there are many ASN.1 parsers out there, but they cost quite a lot, so I am trying to write my own.

I am fairly new to regular expressions. What regular expression, in C#, would extract the text for the placeholders A, B, C and D from the following?

A ::= B
{
    C1 D1,
    C2 D2,
    C3 D3
}

where A, C and D can be any valid word consisting of any combination of the following characters:

  • A-Z
  • a-z
  • 0-9
  • _

And B can be any ASN.1 type, such as "SEQUENCE", "SEQUENCE OF", "CHOICE", "UTF8String", etc. A full list can be found in the "Universal Class Tags" table at this link.

+1  A: 

You mean you want to match that whole construct with one regex? That's a bad idea. Regexes can be useful as a component of a parser, but it's best to keep their role to a minimum. Don't try to match large chunks of text, especially recursive or looping structures. C# regexes are powerful enough to handle such things in many cases, but not all--and that's way beyond beginner level anyway.
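To make "a component of a parser" concrete, here is a minimal sketch in which the regex only splits the input into tokens and ordinary code is left to deal with the structure. The names Asn1Lexer and Tokenize are made up for illustration, and the word characters are the ones listed in the question, not the full ASN.1 lexical rules.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Asn1Lexer
{
    // \G anchors each match at the position passed to Match(), so nothing
    // can be skipped silently. A token is "::=", one of "{ } ,", or a word.
    private static readonly Regex Token =
        new Regex(@"\G\s*(?<tok>::=|[{},]|[A-Za-z0-9_]+)");

    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        string text = input.TrimEnd();   // so trailing whitespace is not an error
        int pos = 0;
        while (pos < text.Length)
        {
            Match m = Token.Match(text, pos);
            if (!m.Success)
                throw new FormatException("Unexpected character near offset " + pos);
            tokens.Add(m.Groups["tok"].Value);
            pos = m.Index + m.Length;    // continue right after this token
        }
        return tokens;
    }
}
```

Calling Asn1Lexer.Tokenize("A ::= B { C1 D1 }") should give the seven tokens A, ::=, B, {, C1, D1 and }. Deciding which token is A, B, C or D is then a job for plain code walking that list, not for a bigger regex.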

I suggest you try it without using regexes at all. Otherwise you'll constantly be distracting yourself, wondering how the regex technique you haven't learned yet would make the current task easier, or solve the problem more elegantly (if you'll pardon my language). Concentrate on writing solid, readable, maintainable code--that being another weakness of regexes.
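As a rough sketch of what "without regexes" could look like for the exact shape in the question, the code below uses only IndexOf, Substring and Split. Asn1Definition, Asn1Reader and their members are hypothetical names chosen for this example. It assumes a single flat definition with no nested braces and at least one member; anything nested would need a proper recursive parser.

```csharp
using System;
using System.Collections.Generic;

public sealed class Asn1Definition
{
    public string Name = "";       // placeholder A
    public string TypeName = "";   // placeholder B, e.g. "SEQUENCE OF"
    // Each pair is (C, D): member name and member type.
    public List<KeyValuePair<string, string>> Members =
        new List<KeyValuePair<string, string>>();
}

public static class Asn1Reader
{
    private static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };

    public static Asn1Definition Parse(string text)
    {
        // Locate the three landmarks: "::=", the opening and the closing brace.
        int assign = text.IndexOf("::=", StringComparison.Ordinal);
        int open = text.IndexOf('{');
        int close = text.LastIndexOf('}');
        if (assign < 0 || open < assign || close < open)
            throw new FormatException("Expected: Name ::= Type { members }");

        var def = new Asn1Definition();
        def.Name = RequireWord(text.Substring(0, assign).Trim());

        // B may be several words ("SEQUENCE OF"), so keep them all.
        string[] typeWords = SplitWords(text.Substring(assign + 3, open - assign - 3));
        if (typeWords.Length == 0)
            throw new FormatException("Missing type after ::=");
        def.TypeName = string.Join(" ", typeWords);

        // Members are comma-separated "C D" pairs between the braces.
        string body = text.Substring(open + 1, close - open - 1);
        foreach (string entry in body.Split(','))
        {
            string[] words = SplitWords(entry);
            if (words.Length != 2)
                throw new FormatException("Expected 'name type' but got: " + entry.Trim());
            def.Members.Add(new KeyValuePair<string, string>(
                RequireWord(words[0]), RequireWord(words[1])));
        }
        return def;
    }

    private static string[] SplitWords(string s)
    {
        return s.Split(Whitespace, StringSplitOptions.RemoveEmptyEntries);
    }

    // Accepts only the characters listed in the question: A-Z, a-z, 0-9, _.
    private static string RequireWord(string s)
    {
        if (s.Length == 0) throw new FormatException("Empty identifier");
        foreach (char c in s)
        {
            bool ok = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                   || (c >= '0' && c <= '9') || c == '_';
            if (!ok) throw new FormatException("Invalid identifier: " + s);
        }
        return s;
    }
}
```

Every step here is something you can single-step in a debugger, and each error message says what was expected, which is harder to get out of one large pattern.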

Alan Moore