Hi folks,

I'm trying to port some Java code to C#. I'm wondering whether the following C# code is equivalent to the original Java source.

Source: Java Code

private static final Pattern SIMPLE_IDENTIFIER_NAME_PATTERN = 
    Pattern.compile("^[a-zA-Z_][a-zA-Z0-9_]*$");

private static boolean isValidIdentifier(String s) {
    Matcher m = SIMPLE_IDENTIFIER_NAME_PATTERN.matcher(s);
    return (m.matches() && !reserved.contains(s));
}

Destination: C# Code

private static readonly Regex SIMPLE_IDENTIFIER_NAME_PATTERN = 
    new Regex("^[a-zA-Z_][a-zA-Z0-9_]*$", RegexOptions.Compiled);

private static bool IsValidIdentifier(string s)
{
    Match match = SIMPLE_IDENTIFIER_NAME_PATTERN.Match(s);
    return (match.Success && !Reserved.Contains(s));
}

Cheers :)

+1  A: 

Looks good, but why don't you start by porting your unit tests?

Aaron Maenpaa
Because I don't have (nor want) Java installed, and I don't know what to expect. I'm flying blind.
Pure.Krome
I'm sorry but that seems kind of silly. If you're really afraid of some kind of contamination put it in a VM and write some characterization tests (http://en.wikipedia.org/wiki/Characterization_Test).
Aaron Maenpaa
+3  A: 

As per my comment, I think you should write a unit test (or tests) to verify that the port works as expected.
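A minimal sketch of what such tests could pin down on the Java side, so that the same table of inputs can be replayed against the C# port. The class name, the helper methods, and the contents of the reserved-word set are hypothetical; the question does not show what `reserved` actually holds.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

final class IdentifierCharacterization {
    // Pattern copied from the question.
    private static final Pattern SIMPLE_IDENTIFIER_NAME_PATTERN =
        Pattern.compile("^[a-zA-Z_][a-zA-Z0-9_]*$");

    // Hypothetical stand-in for the `reserved` collection in the question.
    private static final Set<String> RESERVED =
        new HashSet<>(Arrays.asList("class", "return"));

    static boolean isValidIdentifier(String s) {
        return SIMPLE_IDENTIFIER_NAME_PATTERN.matcher(s).matches()
            && !RESERVED.contains(s);
    }

    // Inputs paired with the result the Java version gives for them.
    static Map<String, Boolean> expectations() {
        Map<String, Boolean> cases = new LinkedHashMap<>();
        cases.put("foo", true);
        cases.put("_foo1", true);
        cases.put("1foo", false);    // must not start with a digit
        cases.put("", false);        // at least one character is required
        cases.put("foo bar", false); // no whitespace allowed
        cases.put("foo\n", false);   // matches() must consume the whole input
        cases.put("class", false);   // well-formed but reserved
        return cases;
    }

    // Counts the cases whose actual result differs from the recorded one.
    static int countMismatches() {
        int mismatches = 0;
        for (Map.Entry<String, Boolean> e : expectations().entrySet()) {
            if (isValidIdentifier(e.getKey()) != e.getValue()) {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

Running the same inputs through the C# `IsValidIdentifier` and comparing the results would show where the two versions disagree; the trailing-newline case is the one most likely to differ.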

Mitch Wheat
A: 

Beware that a readonly field is not the same as an immutable object. readonly means you cannot change which Regex instance the field refers to, but it does not stop the object itself from being changed. (Luckily, the Regex class won't let you change the expression after construction.)

Beware that .NET regex syntax is not identical to *nix regex syntax, so you may get bitten there. Check the MSDN docs to confirm that the pattern does what you need:

MSDN Regex Syntax
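One behavioural difference worth checking in this particular port: Java's `Matcher.matches()` has to consume the whole input, whereas .NET's `Regex.Match` searches for a match, much like Java's `Matcher.find()`. Since `$` also matches just before a final newline, the search form accepts a string with one trailing newline. The sketch below uses `find()` as a stand-in for the search behaviour; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class AnchorSemantics {
    // Pattern from the question, ending in `$`.
    private static final Pattern DOLLAR =
        Pattern.compile("^[a-zA-Z_][a-zA-Z0-9_]*$");

    // Same pattern, but `\z` only matches at the very end of the input.
    private static final Pattern END_ONLY =
        Pattern.compile("^[a-zA-Z_][a-zA-Z0-9_]*\\z");

    // Whole-input match, as in the original Java code.
    static boolean wholeMatch(String s) {
        return DOLLAR.matcher(s).matches();
    }

    // Search-style match, comparable to what Regex.Match does in .NET.
    static boolean searchMatch(String s) {
        return DOLLAR.matcher(s).find();
    }

    // Search-style match with the stricter end anchor.
    static boolean strictSearchMatch(String s) {
        return END_ONLY.matcher(s).find();
    }
}
```

For the input `"abc\n"`, `wholeMatch` returns false while `searchMatch` returns true; `strictSearchMatch` returns false again. If the C# port needs to reject a trailing newline the way the Java code does, ending the C# pattern with `\z` instead of `$` is probably the simplest way to get there.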

Spence
Confirm? That's why I'm asking this question :) I believe it's pretty much an exact copy, but I'm unsure because I never use regex or Java.
Pure.Krome
+1  A: 

Your caret and dollar already anchor the pattern to the beginning and end of the entire string, which is what you want when validating a single identifier. You would set the RegexOptions.Multiline option only if you wanted them to match at embedded newlines in the subject string as well. With that option, a string such as "1abc\nfoo" would count as a match because its second line is a valid identifier, so think twice before using this version:

private static readonly Regex SIMPLE_IDENTIFIER_NAME_PATTERN = new Regex("^[a-zA-Z_][a-zA-Z0-9_]*$", RegexOptions.Compiled | RegexOptions.Multiline);
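To see what a multiline flag actually changes, here is a small Java sketch in which `Pattern.MULTILINE` plays the role that `RegexOptions.Multiline` plays in .NET. It uses `find()` because that is the closest Java counterpart to `Regex.Match`; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class MultilineAnchors {
    private static final String IDENTIFIER = "^[a-zA-Z_][a-zA-Z0-9_]*$";

    // Default mode: ^ and $ refer to the start and end of the whole input.
    private static final Pattern DEFAULT = Pattern.compile(IDENTIFIER);

    // Multiline mode: ^ and $ also match at the start and end of each line.
    private static final Pattern MULTI =
        Pattern.compile(IDENTIFIER, Pattern.MULTILINE);

    static boolean foundDefault(String s) {
        return DEFAULT.matcher(s).find();
    }

    static boolean foundMultiline(String s) {
        return MULTI.matcher(s).find();
    }
}
```

For the input `"1bad\ngood"`, `foundDefault` returns false because the string as a whole starts with a digit, while `foundMultiline` returns true because the second line on its own is a valid identifier. For a validator that should accept or reject the complete string, the default mode is the safer choice.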

It may also be worthwhile to evaluate whether you need RegexOptions.Compiled at all. Will the regex be used repeatedly (for example, in a loop)? If not, the one-time cost of compiling it will likely outweigh any per-match speedup, and overall performance will be lower.

Besides this point, your conversion appears to be valid. As some of the others have suggested, the only way to be reasonably sure is to unit test it.

Cerebrus