A string must not include spaces or special characters. Only a-z, A-Z, 0-9, the underscore, and the period characters are allowed.

How do I achieve this?

Update:
All the solutions posted worked for me.

Thanks everyone for helping out.

+3  A: 
if (!myString.matches("^[a-zA-Z0-9._]*$")) {
    // fail ...
}

or you can use the \w character class (shorthand for [a-zA-Z_0-9])

if (!myString.matches("^[\\w.]*$")) {
    // fail ...
}
Asaph
I think that using the \w for word characters and \d for digits might make it more readable.
Thomas Owens
\w is alphanumeric + _
The requirements may well call for rejecting foreign language letters so second guessing the character set could have disastrous consequences.
PP
@PP: I don't think the `\w` character class is locale sensitive.
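A quick way to check this is the sketch below. It assumes Java 7 or later, because the opt-in `Pattern.UNICODE_CHARACTER_CLASS` flag does not exist in older versions; the class name `WordClassDemo` is made up for the example.

```java
import java.util.regex.Pattern;

class WordClassDemo {
    // Default \w: ASCII only, i.e. [a-zA-Z_0-9], regardless of the default locale.
    static final Pattern ASCII_WORD = Pattern.compile("\\w");

    // Opt-in Unicode-aware \w (assumes Java 7+, where this flag was added).
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w", Pattern.UNICODE_CHARACTER_CLASS);

    public static void main(String[] args) {
        // Latin small letter e with acute, written as an escape to avoid
        // source-encoding problems.
        String eAcute = "\u00e9";

        System.out.println(ASCII_WORD.matcher(eAcute).matches());   // false
        System.out.println(UNICODE_WORD.matcher(eAcute).matches()); // true
        System.out.println(ASCII_WORD.matcher("7").matches());      // true, digits are in \w
    }
}
```

So with the default flags, `[\\w.]` should accept exactly the characters the question lists.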
Asaph
@Thomas Owens: I updated my answer to offer an alternative that uses the `\w` character class.
Asaph
I think there is no need for using '^' and '$' when using `matches`. Another story if using `find` ...
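A small sketch of the difference (the class name `AnchorDemo` and the sample input are only for illustration):

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // The same character class, once without and once with explicit anchors.
    static final Pattern UNANCHORED = Pattern.compile("[a-zA-Z0-9._]*");
    static final Pattern ANCHORED = Pattern.compile("^[a-zA-Z0-9._]*$");

    public static void main(String[] args) {
        String bad = "foo bar"; // contains a space, so it should be rejected

        // matches() must consume the whole input, so anchors add nothing here.
        System.out.println(UNANCHORED.matcher(bad).matches()); // false
        System.out.println(ANCHORED.matcher(bad).matches());   // false

        // find() looks for any matching substring, so anchors matter.
        System.out.println(UNANCHORED.matcher(bad).find());    // true ("foo" matches)
        System.out.println(ANCHORED.matcher(bad).find());      // false
    }
}
```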
Carlos Heuberger
A: 

"[\\w,]+" should do the trick

Devon_C_Miller
I don't think digits are included in \w...
Thomas Owens
@Thomas Owens: You may not think so, but they are: `\w A word character: [a-zA-Z_0-9]` -- JavaDoc for java.util.regex.Pattern
R. Bemrose
@Thomas Owens: Digits *are* included in `\w`. The problem with the regex above is that it includes `,` which is not an allowed character according to the OP and it doesn't allow `.` which is supposed to be allowed. Additionally, the regex doesn't anchor at the beginning and end with `^` and `$` respectively.
Asaph
Ah. Thanks for the clarification. It's been a while since I played with regex.
Thomas Owens
A: 

You could simply delete all the characters that don't match the set [a-zA-Z0-9_.]. Alternatively you could replace characters not in the set with a valid character (e.g. the underscore). Finally you could altogether reject any string that does not consist solely of characters in the permitted set.
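A minimal sketch of the three options, assuming plain `String` methods are acceptable; the class and method names (`Sanitizer`, `strip`, `substitute`, `isValid`) are made up for this example:

```java
class Sanitizer {
    // Matches any single character outside the permitted set [a-zA-Z0-9_.].
    private static final String DISALLOWED = "[^a-zA-Z0-9_.]";

    // Option 1: delete every character that is not in the set.
    static String strip(String s) {
        return s.replaceAll(DISALLOWED, "");
    }

    // Option 2: replace every such character with an underscore.
    static String substitute(String s) {
        return s.replaceAll(DISALLOWED, "_");
    }

    // Option 3: accept the string only if it consists solely of permitted characters.
    // Note that this accepts the empty string; use + instead of * to reject it.
    static boolean isValid(String s) {
        return s.matches("[a-zA-Z0-9_.]*");
    }

    public static void main(String[] args) {
        String input = "my file-1.txt";
        System.out.println(strip(input));      // myfile1.txt
        System.out.println(substitute(input)); // my_file_1.txt
        System.out.println(isValid(input));    // false
    }
}
```

Which one fits depends on whether silently altering the user's input is acceptable in your case.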

PP
+3  A: 

A different solution:

text = text.replaceAll("[^\\w.]", "");

It removes the unwanted characters instead of just detecting them.

From Sun's website:

\w  A word character: [a-zA-Z_0-9]
Mark Byers
+1 for including a link to the docs. Now at least the poster has a chance to learn something so they won't need to come back for more spoon feeding.
camickr
+3  A: 

I am certain that by the time I finish typing this, you will have received your answer. So here is some genuine advice to go with it: take the time (an hour or so) to learn the basics of regular expressions.

You will be surprised how often they show up in solutions to 'real world' problems.

Great testing resource -> http://gskinner.com/RegExr/

cap3t0wn
A: 

You can either write an "all characters must be one of these" regular expression, or simply ask whether any of the characters you dislike are present at all and, if so, reject the string. I believe the latter will be the easiest to write and to understand later.
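Roughly, the two variants could look like this. It is only a sketch: the class and method names are invented for the example, and the permitted set is taken from the question.

```java
import java.util.regex.Pattern;

class NameCheck {
    // Variant 1: the whole string must consist of permitted characters.
    private static final Pattern ALL_ALLOWED = Pattern.compile("[a-zA-Z0-9_.]*");

    // Variant 2: look for a single character that is NOT permitted.
    private static final Pattern ANY_DISALLOWED = Pattern.compile("[^a-zA-Z0-9_.]");

    static boolean validByWhitelist(String s) {
        return ALL_ALLOWED.matcher(s).matches();
    }

    static boolean validByRejection(String s) {
        // Valid exactly when no disallowed character can be found.
        return !ANY_DISALLOWED.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"user_name.01", "user name", "caf\u00e9"}) {
            System.out.println(validByWhitelist(s) + " " + validByRejection(s));
        }
    }
}
```

Both methods should agree on every input; the second one also makes it easy to report which character was the offending one, by reading it from the matcher after `find()` succeeds.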

Thorbjørn Ravn Andersen