Hi everyone,

I'm new to Java and I'm trying to do something fairly simple, but I'm not allowed to use regex, which is my favorite tool for this kind of task. Basically, I need to make sure a string contains only letters, digits, spaces and dashes.

I found the class org.apache.commons.lang.StringUtils and its almost-adequate method isAlphanumericSpace(String), but I also need to allow dashes. Is there any other simple method, without regex, that comes to mind?

+3  A: 

Just iterate through the string, using the character-class methods in java.lang.Character to test whether each character is acceptable. That is presumably all the StringUtils methods do anyway, and regular expressions are just a way of driving a generalised engine to do much the same.
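
A minimal sketch of that loop, assuming a hypothetical class name CharacterClassCheck and method name isAlphanumericSpaceOrDash (neither comes from a library). It treats null as invalid and an empty string as valid, which is an assumption rather than a stated requirement. Note that Character.isLetterOrDigit(char) accepts any Unicode letter or digit in the Basic Multilingual Plane, not just ASCII.

```java
final class CharacterClassCheck {

    // Hypothetical helper: true if every char is a letter, a digit,
    // a space or a dash. Null is treated as invalid (an assumption).
    static boolean isAlphanumericSpaceOrDash(CharSequence text) {
        if (text == null) {
            return false;
        }
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Unicode-aware test from java.lang.Character
            if (Character.isLetterOrDigit(c) || c == ' ' || c == '-') {
                continue;
            }
            // First unacceptable character: fail right away
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAlphanumericSpaceOrDash("order 66-b"));
        System.out.println(isAlphanumericSpaceOrDash("snake_case"));
    }
}
```

If only ASCII letters should pass, the Character call can be swapped for explicit range comparisons.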

araqnid
+6  A: 

Hmm... just program it yourself using String.charAt(int); it's pretty easy.

Iterate over every char in the string using a position index, then test each one using the fact that the ASCII characters 0 to 9, a to z and A to Z have consecutive codes. You only need to check that each character satisfies one of these conditions:

  • between '0' and '9'
  • between 'a' and 'z'
  • between 'A' and 'Z'
  • a space ' '
  • a hyphen '-'

Here is a basic code sample (it takes a CharSequence, so you can pass a String or a StringBuilder as the argument):

public boolean isValidChar(CharSequence seq) {
    int len = seq.length();
    for(int i=0;i<len;i++) {
        char c = seq.charAt(i);
        // Test for all positive cases
        if('0'<=c && c<='9') continue;
        if('a'<=c && c<='z') continue;
        if('A'<=c && c<='Z') continue;
        if(c==' ') continue;
        if(c=='-') continue;
        // ... insert more positive character tests here
        // If we get here, we had an invalid char, fail right away
        return false;
    }
    // All seen chars were valid, succeed
    return true;
}
Varkhan
I would use the java.lang.Character tests instead of making assumptions based on the ASCII character set.
kenj0418
Yes, Character.isLetterOrDigit() does that, but it comes with a very notable performance cost (4 or 5 times slower than a simple code point comparison).
Varkhan
It will reject lots of other valid alphabetic characters that just aren't used much in English, just so it takes 1 μs instead of 4 μs (yes, it'll reject "μs" :-) ). Making assumptions the asker didn't state, in order to get a minor performance gain he didn't ask for, isn't a good idea.
kenj0418
+1  A: 

You have two options:

  1. Compose a list of chars that CAN be in the string, then loop over the string checking that each character IS in the list.
  2. Compose a list of chars that CANNOT be in the string, then loop over the string checking that each character IS NOT in the list.

Choose whichever list is quicker to compose.
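
A rough sketch of the first option, assuming the allowed set is ASCII letters, digits, space and dash. The class name AllowedListCheck, the constant ALLOWED and the method containsOnlyAllowed are made up for illustration. The "list" here is simply a String, searched with String.indexOf.

```java
final class AllowedListCheck {

    // The list of chars that CAN appear (assumed: ASCII only)
    private static final String ALLOWED =
            "abcdefghijklmnopqrstuvwxyz"
            + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            + "0123456789"
            + " -";

    // Hypothetical helper: true if every char of text is in ALLOWED
    static boolean containsOnlyAllowed(String text) {
        for (int i = 0; i < text.length(); i++) {
            // indexOf returns -1 when the char is not in the list
            if (ALLOWED.indexOf(text.charAt(i)) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(containsOnlyAllowed("Room 12-B"));
        System.out.println(containsOnlyAllowed("no@symbols"));
    }
}
```

The second option has the same shape with the test inverted: keep a String of forbidden chars and return false when indexOf finds one.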

MStodd
+1  A: 

You could use:

StringUtils.isAlphanumericSpace(string.replace('-', ' '));

Skip Head