I'm using C# and I want to check whether a string contains any one of ten characters (*, &, #, and so on). What is the best way?

+2  A: 

string.IndexOfAny(...)

Jason Williams
+19  A: 

The following would be the simplest method, in my view:

var match = str.IndexOfAny(new char[] { '*', '&', '#' }) != -1;

Or in a possibly easier to read form:

var match = str.IndexOfAny("*&#".ToCharArray()) != -1;

Depending on the context and performance required, you may or may not want to cache the char array.

Noldorin
Change your new char[] {...} to "%$...".ToCharArray() and you have a winner.
Robert Koritnik
What would be the difference?
Decio Lira
@Decio: It would be easier to read and briefer (by the time you've got 10 characters in the array.)
Jon Skeet
@Decio: It's easier to type :)
Aistina
Damn, Jon beat me to it :P
Aistina
oh, I get it. thanks ;)
Decio Lira
+10  A: 

As others have said, use IndexOfAny. However, I'd use it in this way:

private static readonly char[] Punctuation = "*&#...".ToCharArray();

public static bool ContainsPunctuation(string text)
{
    return text.IndexOfAny(Punctuation) >= 0;
}

That way you don't end up creating a new array on each call. The string is also easier to scan than a series of character literals, IMO.

Of course, if you're only going to call this once, so that creating the array each time isn't a problem, you could use either:

private const string Punctuation = "*&#...";

public static bool ContainsPunctuation(string text)
{
    return text.IndexOfAny(Punctuation.ToCharArray()) >= 0;
}

or

public static bool ContainsPunctuation(string text)
{
    return text.IndexOfAny("*&#...".ToCharArray()) >= 0;
}

It really depends on which you find more readable, whether you want to use the punctuation characters elsewhere, and how often the method is going to be called.


EDIT: Here's an alternative to Reed Copsey's method for finding out if a string contains exactly one of the characters.

// Requires: using System.Collections.Generic;
private static readonly HashSet<char> Punctuation = new HashSet<char>("*&#...");

public static bool ContainsOnePunctuationMark(string text)
{
    bool seenOne = false;

    foreach (char c in text)
    {
        // TODO: Experiment to see whether HashSet is really faster than
        // Array.Contains. If all the punctuation is ASCII, there are other
        // alternatives...
        if (Punctuation.Contains(c))
        {
            if (seenOne)
            {
                return false; // This is the second punctuation character
            }
            seenOne = true;
        }
    }
    return seenOne;
}
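
The TODO in the code above mentions that there are other alternatives if all the punctuation is ASCII. One possibility, as a minimal sketch only, is a 128-entry lookup table indexed by character code. The names `AsciiPunctuation`, `Marks`, `IsMark`, `BuildTable` and `ContainsExactlyOneMark` are made up for illustration, and the three-character set is a stand-in for the real list. Whether this actually beats the HashSet would still need measuring.

```csharp
using System;

public static class AsciiPunctuation
{
    // Hypothetical stand-in set; substitute the real characters (all must be ASCII).
    private const string Marks = "*&#";

    // One flag per ASCII code point (0-127), built once when the type is initialized.
    private static readonly bool[] IsMark = BuildTable(Marks);

    private static bool[] BuildTable(string marks)
    {
        var table = new bool[128];
        foreach (char c in marks)
        {
            if (c >= 128)
            {
                throw new ArgumentException("Only ASCII characters are supported.", nameof(marks));
            }
            table[c] = true;
        }
        return table;
    }

    // True only if the text contains exactly one occurrence of any character in the set.
    public static bool ContainsExactlyOneMark(string text)
    {
        bool seenOne = false;
        foreach (char c in text)
        {
            // Non-ASCII characters can never be in the table, so skip them.
            if (c < 128 && IsMark[c])
            {
                if (seenOne)
                {
                    return false; // Second match: bail out early
                }
                seenOne = true;
            }
        }
        return seenOne;
    }
}
```

The loop has the same shape as the HashSet version, including the early exit on the second match; only the membership test changes.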
Jon Skeet
I suppose it's worth caching the char array if performance is a problem, but then again it may not be worth it depending on the context.
Noldorin
Yes, if you're only using it in a method that's going to be executed once it may not be worth it. However, I think it improves the readability as well as the performance. You could use the `ToCharArray` form "inline" if required, of course.
Jon Skeet
You might want to make Punctuation static in the second code block.
rein
@rein: const implies static.
Jon Skeet
+4  A: 

If you just want to see whether the string contains any of the characters, I'd recommend using string.IndexOfAny, as suggested elsewhere.

If you want to verify that a string contains exactly one of the ten characters, and that it occurs only once, then it gets a bit more complicated. I believe the fastest way would be to compute the intersection of the string with the character set, then check for duplicates.

// Requires: using System.Linq;
private static readonly char[] characters = new char[] { '*', '&', ... };

public static bool ContainsOneCharacter(string text)
{
    // Distinct characters that appear in both the text and the set
    var intersection = text.Intersect(characters).ToList();
    if (intersection.Count != 1)
        return false; // Either none or more than one of the characters is present

    // The single matching character must also occur exactly once in the text
    return text.Count(t => t == intersection[0]) == 1;
}
Reed Copsey
I don't think that's *really* the fastest way. Editing my answer :)
Jon Skeet
Yeah - I suppose a single loop is probably faster in this case, especially with the small set of punctuation. I'd be curious to try testing this with large strings to see which is truly faster.
Reed Copsey
I think that finding the intersection of the two strings is going to have to go character by character anyway, so I can't see how it would be faster... and my suggested route not only uses a single pass, but also has the option of an "early out". Imagine if text is a million characters long, but the first two are both "*" :)
Jon Skeet
Yeah - your hash set's a better approach.
Reed Copsey