views:

55

answers:

3
+1  Q: 

Java regex help

Can somebody please show me how to do a Java regex that takes in a string and returns a string with all characters removed BUT a-z and 0-9?

E.g. given the string a%4aj231*9.+ it should return a4aj2319

thanks.

+1  A: 

\d is a digit, \p{L} is any letter (a-z and A-Z, plus letters from other scripts).

str = str.replaceAll("[^\\d\\p{L}]", "");
lins314159
Thanks for the quick reply. I just realized: what if I want to preserve spaces as well?
vbn
Just add any characters you don't want to replace within the square brackets.
lins314159
\p{L} also matches a ton of other Unicode characters, such as Δ, ね, and 傻. This expression will leave them all intact.
Sean
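To make the two comments above concrete, here is a rough sketch of keeping spaces as well, once with a plain ASCII class and once with \p{L}. The class name, method names, and sample string are made up for illustration; only String.replaceAll is from the thread.

```java
class KeepSpacesDemo {
    // Keeps ASCII letters, digits, and the space character; removes everything else.
    static String keepAsciiAlnumAndSpaces(String input) {
        return input.replaceAll("[^A-Za-z0-9 ]", "");
    }

    // Keeps any Unicode letter (\p{L}), ASCII digits (\d), and spaces.
    static String keepLettersDigitsAndSpaces(String input) {
        return input.replaceAll("[^\\d\\p{L} ]", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample: the question's string plus a space and a Greek delta (\u0394).
        String sample = "a%4 aj\u0394231*9.+";
        System.out.println(keepAsciiAlnumAndSpaces(sample));    // delta is removed
        System.out.println(keepLettersDigitsAndSpaces(sample)); // delta is kept
    }
}
```

Note that the result has to be assigned or returned, since Java strings are immutable and replaceAll does not modify the original.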
A: 

If you want a-z and 0-9 but not A-Z, then:

str = str.replaceAll("[^\\p{Lower}\\p{Digit}]", "");
Lombo
I didn't go with \w because that includes underscores as well.
lins314159
You are right, I'll change that.
Lombo
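For reference, a small sketch of the POSIX-class version run against the string from the question. By default Java treats \p{Lower} and \p{Digit} as US-ASCII only, so uppercase letters and underscores are dropped along with the punctuation. The class and method names here are hypothetical.

```java
class LowerDigitDemo {
    // Removes every character that is not an ASCII lowercase letter or an ASCII digit.
    static String keepLowerAndDigits(String input) {
        return input.replaceAll("[^\\p{Lower}\\p{Digit}]", "");
    }

    public static void main(String[] args) {
        System.out.println(keepLowerAndDigits("a%4aj231*9.+")); // string from the question
        System.out.println(keepLowerAndDigits("Ab_1"));         // uppercase and underscore are removed
    }
}
```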
A: 
str = str.replaceAll("[^a-z0-9]+", "");

If you also meant to include uppercase characters, then you could use

str = str.replaceAll("[^A-Za-z0-9]+", "");

or the slightly leeter

str = str.replaceAll("[_\\W]+", "");
Sean
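If the same cleanup runs many times, one option is to compile each pattern once and reuse it instead of calling String.replaceAll every time. This is only a sketch of the three expressions above; the class, constant, and method names are invented for the example.

```java
import java.util.regex.Pattern;

class AlnumFilter {
    // The three expressions from this answer, compiled once and reused.
    private static final Pattern NOT_LOWER_OR_DIGIT = Pattern.compile("[^a-z0-9]+");
    private static final Pattern NOT_ALNUM = Pattern.compile("[^A-Za-z0-9]+");
    private static final Pattern UNDERSCORE_OR_NON_WORD = Pattern.compile("[_\\W]+");

    // Keeps only a-z and 0-9.
    static String lowerAndDigits(String s) {
        return NOT_LOWER_OR_DIGIT.matcher(s).replaceAll("");
    }

    // Keeps a-z, A-Z and 0-9.
    static String alnum(String s) {
        return NOT_ALNUM.matcher(s).replaceAll("");
    }

    // Same result as alnum(): \W is everything outside [a-zA-Z_0-9],
    // and the underscore is listed separately so it is removed too.
    static String wordCharsWithoutUnderscore(String s) {
        return UNDERSCORE_OR_NON_WORD.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String sample = "Ab_1 c%"; // hypothetical sample input
        System.out.println(lowerAndDigits(sample));
        System.out.println(alnum(sample));
        System.out.println(wordCharsWithoutUnderscore(sample));
    }
}
```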