Hi all, how can I remove email addresses from a string, along with all digits and special characters?

A sample string could be

"Hello world my # is 123 mail me @ [email protected]"

The output string should be

"Hello world my is mail me"

I googled this and found that I can use the following regular expression

"[^A-Za-z0-9\\.\\@_\\-~#]+"

but that example was about checking whether an email ID is valid, not about removing it. I am new to Java!

+1  A: 

Check out the Java regular expression Pattern class and its uses. There's a useful tutorial here which includes replacement methods.
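As a minimal sketch of what using the Pattern class for replacement can look like: the pattern below is a deliberately loose, illustrative one (not the RFC822 regexp and not taken from the tutorial), and the class and method names are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternReplaceSketch {
    // Illustrative pattern only: text without spaces or '@' on both sides of
    // a single '@', with at least one dot in the right-hand part.
    // It is an assumption for this sketch, not a validator for real addresses.
    private static final Pattern SIMPLE_EMAIL =
            Pattern.compile("[^@\\s]+@[^@\\s]+\\.[^@\\s]+");

    // Hypothetical helper: removes every match of SIMPLE_EMAIL from the text.
    static String stripEmails(String text) {
        Matcher matcher = SIMPLE_EMAIL.matcher(text);
        return matcher.replaceAll("");
    }

    public static void main(String[] args) {
        // user@example.com is a placeholder address used for illustration.
        System.out.println(stripEmails("mail me @ user@example.com please"));
    }
}
```

Compiling the Pattern once and reusing it avoids recompiling the regex on every call, which is what String#replaceAll() does internally.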

An aside: this is a particularly robust regexp to use for RFC822-compliant email addresses :-) You should be able to come up with something more concise for your needs! There's a discussion of email regexps and trade-offs here.

Brian Agnew
This one is unnecessarily and overwhelmingly long, and it doesn't cover the upcoming ICANN decision to allow Unicode characters (Arabic, Chinese, Hebrew, Japanese, Cyrillic, etc.) in URLs and email addresses.
BalusC
It was meant to be tongue-in-cheek and not particularly to be recommended. I will add a smiley.
Brian Agnew
And note the proviso re. RFC822
Brian Agnew
Oh, the smiley :)
BalusC
I know - I thought I'd written it originally and I hadn't :-(
Brian Agnew
+1  A: 

You can use String#replaceAll() for this. Just let it replace any regex matches with an empty string "". The regex you mentioned is, however, not very robust. A better one is this (copied from here and slightly changed for use in plain vanilla text):

string = string.replaceAll("([^.@\\s]+)(\\.[^.@\\s]+)*@([^.@\\s]+\\.)+([^.@\\s]+)", "");
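To see what this single call does and does not remove, here is a small runnable sketch. The class and method names are hypothetical, and user@example.com is a placeholder address assumed for the test input.

```java
class EmailOnlySketch {
    // Same regex as in the answer: local part, optional dotted parts, '@',
    // one or more dotted domain labels, and a final label.
    private static final String EMAIL_REGEX =
            "([^.@\\s]+)(\\.[^.@\\s]+)*@([^.@\\s]+\\.)+([^.@\\s]+)";

    // Hypothetical helper wrapping the replaceAll() call.
    static String removeEmails(String string) {
        return string.replaceAll(EMAIL_REGEX, "");
    }

    public static void main(String[] args) {
        String input = "Hello world my # is 123 mail me @ user@example.com";
        // Brackets make the leftover trailing space visible.
        System.out.println("[" + removeEmails(input) + "]");
        // prints "[Hello world my # is 123 mail me @ ]"
    }
}
```

Only the address itself goes away; the "#", the digits and the standalone "@" are still there, so a second pass is needed for those.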

Hope this helps.

BalusC
Note that he doesn't just want to remove email addresses, he wants to remove *all* "special" characters (for some unknown definition of "special", but clearly including numbers, a hash mark, and an at sign...).
delfuego
Oh. For that just use `string = string.replaceAll("[^\\p{Alpha}\\s]", "");` afterwards.
BalusC
It worked well!
A: 

From your example, it looks like it's not just email addresses you're interested in removing, it's all non-alpha characters, so this is trivial:

str = str.replaceAll("([^.@\\s]+)(\\.[^.@\\s]+)*@([^.@\\s]+\\.)+([^.@\\s]+)", "")
         .replaceAll("[^\\p{Alpha} ]", "")
         .replaceAll("[ ]{2,}+", " ");

See the Pattern JavaDocs for information about what the special character class \p{Alpha} means...
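In short, \p{Alpha} is a POSIX class that, by default, covers only the ASCII letters a-z and A-Z. The sketch below contrasts it with \p{L} (any Unicode letter), which may be a better fit if the text is not plain English. The class and method names are made up for illustration.

```java
class AlphaClassSketch {
    // Keeps ASCII letters and spaces only; accented letters are removed too.
    static String asciiLettersOnly(String s) {
        return s.replaceAll("[^\\p{Alpha} ]", "");
    }

    // Keeps letters from any script, plus spaces.
    static String unicodeLettersOnly(String s) {
        return s.replaceAll("[^\\p{L} ]", "");
    }

    public static void main(String[] args) {
        // "Café 123 naïve!" written with Unicode escapes to avoid
        // source-encoding surprises.
        String text = "Caf\u00e9 123 na\u00efve!";
        System.out.println(asciiLettersOnly(text));
        System.out.println(unicodeLettersOnly(text));
    }
}
```

With the first method the accented letters are dropped along with the digits and the exclamation mark; with the second they are kept.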

delfuego
You're right that the OP doesn't want to only remove the email but I don't think he wants to remove spaces :)
Pascal Thivent
Fair point -- modified, and added a check for multiple spaces to collapse them into one, as per his example.
delfuego
And finally used BalusC's code for email check, to make it all as per the OP's example.
delfuego
+1  A: 

As pointed out by others, you could use regular expressions to clean up your String and replace the unwanted parts with an empty string "". To do so, have a look at the replaceAll(String regex, String replacement) method of the String class and at the Pattern class for the syntax of regular expressions in Java.

Below is some code demonstrating one way to clean the provided sample String (maybe not the most elegant, though):

String input = "Hello world my # is 123 mail me @ [email protected]";
String EMAIL_PATTERN = "([^.@\\s]+)(\\.[^.@\\s]+)*@([^.@\\s]+\\.)+([^.@\\s]+)";

String output = input.replaceAll(EMAIL_PATTERN, "") // Replace emails 
                                                    // by an empty string
        .replaceAll("\\p{Punct}", "") // Replace all punctuation. One of
                                      // !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
        .replaceAll("\\d", "") // Replace any digit by an empty string
        .replaceAll("\\p{Blank}{2,}+", " "); // Replace any Blank (a  space or 
                                             // a tab) repeated more than once
                                             // by a single space.

System.out.println(output);

Running this code produces the following output:

Hello world my is mail me 

If you need to remove more garbage (or less, like punctuation), well, you've got the principle. Adapt it to suit your needs.
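One possible adaptation, sketched under a few assumptions: the patterns are precompiled so they can be reused, everything that is not an ASCII letter or whitespace is dropped in one pass, and the result is trimmed so no trailing space is left. The class name, method name and the placeholder address user@example.com are hypothetical.

```java
import java.util.regex.Pattern;

class TextCleanerSketch {
    // Email regex as used in the answers above.
    private static final Pattern EMAIL = Pattern.compile(
            "([^.@\\s]+)(\\.[^.@\\s]+)*@([^.@\\s]+\\.)+([^.@\\s]+)");
    // Anything that is neither an ASCII letter nor whitespace.
    private static final Pattern NON_LETTER = Pattern.compile("[^\\p{Alpha}\\s]");
    // One or more consecutive whitespace characters.
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    static String clean(String input) {
        String noEmails = EMAIL.matcher(input).replaceAll("");
        String lettersOnly = NON_LETTER.matcher(noEmails).replaceAll("");
        // Collapse whitespace runs to one space, then drop leading/trailing space.
        return WHITESPACE_RUN.matcher(lettersOnly).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String input = "Hello world my # is 123 mail me @ user@example.com";
        System.out.println("[" + clean(input) + "]");
        // prints "[Hello world my is mail me]"
    }
}
```

Note that collapsing with \s+ also turns tabs and newlines into single spaces, which may or may not be what you want.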

Pascal Thivent
Thank you so much! You are great!