I have a Java String object. I need to extract only digits from it. I'll give an example:

From "123-456-789" I want "123456789".

Is there a library function that extracts only digits?

Thanks for the answers. Before I try these, do I need to install any additional libraries?

+15  A: 

You can use regex and delete non-digits.

str = str.replaceAll("\\D+","");
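As a minimal sketch of the same idea: String.replaceAll and java.util.regex ship with the JDK, so no additional library should be needed. If the method is called often, the pattern can be compiled once and reused. The class name RegexDigits and the method name digitsOnly are made up for this illustration.

```java
import java.util.regex.Pattern;

public class RegexDigits {
    // Compiled once and reused. Without the UNICODE_CHARACTER_CLASS flag,
    // \D in Java matches every character that is not an ASCII digit 0-9.
    private static final Pattern NON_DIGITS = Pattern.compile("\\D+");

    // Hypothetical helper: removes every run of non-digit characters.
    public static String digitsOnly(String input) {
        return NON_DIGITS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("123-456-789")); // prints 123456789
    }
}
```

Whether precompiling makes a measurable difference depends on how hot the call site is; for occasional calls the one-liner above is likely fine.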


codaddict
Nice short code. A linear search might be faster, but I think yours makes more sense.
kasten
Downvoter: Care to explain?
codaddict
@codaddict Sure. While replaceAll is an incredibly short and apparently easy piece of code, it's incredibly inefficient for several reasons. Regex breaks refactoring and makes the code difficult to maintain, and it is also an incredibly heavy operation for such a simple task. We've all done it at some point, but it's simply not good code, sorry.
BjornS
@BjornS Oh please: this is a standard solution, easy, understandable and fast enough for most purposes. It may not be a best practice (although I'd argue about that), but it certainly doesn't deserve a downvote.
seanizer
@seanizer OK, when does standard become best practice or maintainable? I downvoted this not because I couldn't understand it or because it isn't standard, but because it perpetuates bad code. As I said, I've done the same myself, but if at all possible I try to avoid it. If I shouldn't downvote for this, then what should I downvote for? I'm sorry if I came across as unpleasant.
BjornS
I guess you can downvote anything you like to downvote (no sarcasm intended). But my personal opinion is: when great developers (and we have lots of them here) share some of their advice for free, then I'm going to honor that, and I only downvote stuff that's really awful (check my profile, my current ratio is 14xx up against 17 down). But that's my personal philosophy and you are free to have your own.
seanizer
+1  A: 
public String extractDigits(String src) {
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < src.length(); i++) {
        char c = src.charAt(i);
        if (Character.isDigit(c)) {
            builder.append(c);
        }
    }
    return builder.toString();
}
dogbane
I thought of using Character.isDigit() myself, but it also accepts some characters that are not 0-9 (see the docs: http://download.oracle.com/javase/6/docs/api/java/lang/Character.html#isDigit%28char%29 ).
seanizer
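To illustrate the comment above, here is a small sketch comparing Character.isDigit with a plain ASCII range check. The class IsDigitDemo and the helper isAsciiDigit are hypothetical names used only for this example; U+0663 (ARABIC-INDIC DIGIT THREE) is just one of the non-ASCII characters that Character.isDigit accepts.

```java
public class IsDigitDemo {
    // Hypothetical helper: true only for the ASCII characters '0' through '9'.
    public static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        char arabicIndicThree = '\u0663'; // ARABIC-INDIC DIGIT THREE, U+0663

        // Character.isDigit follows the Unicode decimal digit category.
        System.out.println(Character.isDigit(arabicIndicThree)); // prints true
        // The range check rejects everything outside ASCII 0-9.
        System.out.println(isAsciiDigit(arabicIndicThree));      // prints false
        System.out.println(isAsciiDigit('7'));                   // prints true
    }
}
```

Which behaviour is right depends on the input: for phone numbers or IDs that are stored as ASCII, the stricter check is probably what is wanted.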
+1  A: 

Here's a more verbose solution. Less elegant, but probably faster:

public static String stripNonDigits(final String input){
    final StringBuilder sb = new StringBuilder();
    for(int i = 0; i < input.length(); i++){
        final char c = input.charAt(i);
        if(c >= '0' && c <= '9'){ // ASCII digits only (code points 48-57)
            sb.append(c);
        }
    }
    return sb.toString();
}

Test Code:

public static void main(final String[] args){
    final String input = "0-123-abc-456-xyz-789";
    final String result = stripNonDigits(input);
    System.out.println(result);
}

Output:

0123456789

BTW: I did not use Character.isDigit(ch) because it accepts many other characters besides 0-9.

seanizer
+1  A: 

Using Google Guava:

CharMatcher.DIGIT.retainFrom("123-456-789");

CharMatcher is pluggable and quite interesting to use. For instance, you can do the following:

String input = "My phone number is 123-456-789!";
String output = CharMatcher.is('-').or(CharMatcher.DIGIT).retainFrom(input);

output == 123-456-789

BjornS
Very nice solution (+1), but it suffers from the same problem as the others: lots of characters qualify as Unicode digits, not only the ASCII digits. This code will retain all of these characters: http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5Cp%7Bdigit%7D
seanizer
@seanizer: Then would this be better: CharMatcher.inRange('1','9').retainFrom("123-456-789")?
Emil
@Emil More like CharMatcher.inRange('0','9'), but: yes.
seanizer
@seanizer: Yeah, right. I didn't notice that.
Emil
inRange is what lies behind CharMatcher.DIGIT (http://pastie.org/1252471). It simply takes additional Unicode digit ranges into account. I would still consider these digits, since in reality they are; they are simply not ASCII.
BjornS
You can also use CharMatcher.JAVA_DIGIT for the same purpose; it accepts exactly the digits that Character.isDigit accepts.
BjornS
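If non-ASCII digits should count as digits but the result still has to be plain ASCII, one possible middle ground is to map each decimal digit to its ASCII equivalent using the JDK's Character.digit. This is only a sketch: DigitNormalizer and toAsciiDigits are made-up names, and the loop works on char values, so digits outside the Basic Multilingual Plane (encoded as surrogate pairs) are not handled.

```java
public class DigitNormalizer {
    // Hypothetical helper: keeps decimal digits and rewrites them as ASCII 0-9.
    public static String toAsciiDigits(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            // Character.digit returns the numeric value 0-9 for a decimal
            // digit in radix 10, or -1 for any other character.
            int value = Character.digit(input.charAt(i), 10);
            if (value >= 0) {
                sb.append((char) ('0' + value));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+0663 is ARABIC-INDIC DIGIT THREE.
        System.out.println(toAsciiDigits("12\u0663-45")); // prints 12345
    }
}
```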
+4  A: 

Using Google Guava:

CharMatcher.inRange('0','9').retainFrom("123-456-789")

UPDATE:

Using a precomputed CharMatcher can further improve performance:

CharMatcher ASCII_DIGITS=CharMatcher.inRange('0','9').precomputed();  
ASCII_DIGITS.retainFrom("123-456-789");
Emil