I normally use the following idiom to check if a String can be converted to an integer.

public boolean isInteger( String input ) {
    try {
        Integer.parseInt( input );
        return true;
    }
    catch( Exception e ) {
        return false;
    }
}

Is it just me, or does this seem a bit hackish? What's a better way?

EDIT: See my answer (with benchmarks, based on the earlier answer by rally25rs) to see why I've reversed my position and accepted Jonas Klemming's answer to this problem. I think this original code will be used by most people because it's quicker to implement and more maintainable, but it's orders of magnitude slower when non-integer data is provided.

A: 

How about:

return Pattern.matches("-?\\d+", input);
Kristian
What about the integer 9999999999999999999999999999999999 ?
danatel
Don't forget to check for the negative sign.
yjerem
Don't you need to anchor the beginning and end of the regex, so you won't pass "aaa-1999zzz"?
Tim Howland
Tim, when you call one of the matches() methods (String, Pattern and Matcher each have one), the regex has to match the whole input, making anchors redundant. To find a match as defined by most other regex flavors, you have to use Matcher#find().
Alan Moore
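To illustrate the point about matches() versus find(), here is a minimal sketch. The class and method names (MatchesVsFind, wholeInputMatches, containsMatch) are made up for this example; only Pattern, Matcher#matches() and Matcher#find() come from the JDK.

```java
import java.util.regex.Pattern;

final class MatchesVsFind {
    // Same unanchored pattern as in the answer above.
    private static final Pattern INT = Pattern.compile("-?\\d+");

    // matches() succeeds only if the entire input matches the pattern.
    static boolean wholeInputMatches(String s) {
        return INT.matcher(s).matches();
    }

    // find() succeeds if any part of the input matches the pattern.
    static boolean containsMatch(String s) {
        return INT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("aaa-1999zzz")); // false
        System.out.println(containsMatch("aaa-1999zzz"));     // true
    }
}
```

So the anchors only matter if you switch to find().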
+17  A: 

You have it, but you should only catch NumberFormatException.

Ovidiu Pacurar
Yeah, it's considered bad form to catch more exceptions than you need.
phasetwenty
You're right. NFE is the only one that can be thrown, but it's still a bad habit to get into.
Bill the Lizard
I think a NPE can be thrown if input is null, so your method should probably handle that explicitly, whichever way you want to.
Dov Wasserman
@Dov: You're right, NPE and NFE should both be explicitly caught.
Bill the Lizard
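A side note on the null case: as far as I can tell from the Integer.parseInt documentation, a null argument is reported as a NumberFormatException, not a NullPointerException, so catching NumberFormatException alone should cover it. Here is a small sketch you can run on your own JDK to check; NullParseDemo and exceptionNameFor are hypothetical names used only for this illustration.

```java
final class NullParseDemo {
    // Returns the simple name of the exception parseInt throws, or "none".
    static String exceptionNameFor(String input) {
        try {
            Integer.parseInt(input);
            return "none";
        } catch (NumberFormatException e) {
            return "NumberFormatException";
        } catch (RuntimeException e) {
            // Any other unchecked exception, e.g. NullPointerException.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(exceptionNameFor(null));
        System.out.println(exceptionNameFor("abc"));
        System.out.println(exceptionNameFor("42"));
    }
}
```

An explicit null check at the top of the method is still a reasonable choice if you want the intent to be obvious.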
+3  A: 

This is shorter, but shorter isn't necessarily better (and it won't catch integer values which are out of range, as pointed out in danatel's comment):

input.matches("^-?\\d+$");

Personally, since the implementation is squirrelled away in a helper method and correctness trumps length, I would just go with something like what you have (except that I would catch NumberFormatException rather than the base Exception class).

insin
And maybe \\d{1,10} is, although not perfect, better than \\d+ for catching Java Integers
Maglob
+2  A: 
boolean is_number = true;
try {
  Integer.parseInt(mystr);
} catch (NumberFormatException e) {
  is_number = false;
}
Ricardo Acras
+1  A: 

What you did works, but you probably shouldn't always check that way. Throwing exceptions should be reserved for "exceptional" situations (maybe that fits in your case, though), and they are very costly in terms of performance.

lucas
They're only costly if they get thrown.
Bill the Lizard
didn't realize.. thanks!
lucas
+8  A: 

If you are not concerned with potential overflow problems, this function will perform about 20-30 times faster than using Integer.parseInt().

public static boolean isInteger(String str) {
 if (str == null) {
  return false;
 }
 int length = str.length();
 if (length == 0) {
  return false;
 }
 int i = 0;
 if (str.charAt(0) == '-') {
  if (length == 1) {
   return false;
  }
  i = 1;
 }
 for (; i < length; i++) {
  char c = str.charAt(i);
  if (c <= '/' || c >= ':') {
   return false;
  }
 }
 return true;
}
Jonas Klemming
(c <= '/' || c >= ':') is a bit strange looking. I would have used (c < '0' || c > '9')... are the <= and >= operators faster in Java?
Anonymous
Why not use regex? Isn't return str.matches("^-?\\d+$") identical to the code above?
Maglob
I would use this method or the original method from the question before regex. This for performance, the original method for speed of implementation and sheer maintainability. The regex solution has nothing going for it.
Bill the Lizard
I am worried about overflow, but this method can be adapted for BigInts and still be way faster than other methods. In case anyone is wondering why I'm putting so much effort into such a simple problem, I'm creating a library to aid in solving Project Euler problems.
Bill the Lizard
Why not use java.lang.Character.isDigit()?
Eric Weilnau
isDigit returns true for things like Devanagari digits. While Integer.parseInt handles these as digits, it may not be what's expected for most use cases.
erickson
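The difference described in the comment above can be seen with a short sketch. DigitKinds and isAsciiDigit are names invented for this example; the Unicode escapes are the Devanagari digits one, two and three.

```java
final class DigitKinds {
    // ASCII-only test, equivalent to the range check in the answer above.
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        char devanagariOne = '\u0967';
        String devanagari123 = "\u0967\u0968\u0969";

        // Character.isDigit accepts any Unicode decimal digit.
        System.out.println(Character.isDigit(devanagariOne));  // true
        // The ASCII range check rejects it.
        System.out.println(isAsciiDigit(devanagariOne));       // false
        // Integer.parseInt also accepts these digits.
        System.out.println(Integer.parseInt(devanagari123));   // 123
    }
}
```

So a check built on Character.isDigit() agrees with Integer.parseInt(), while the hand-written ASCII loop is stricter. Which one you want depends on the input you expect.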
A: 

Integer.valueOf(string); works for me most of the time!

anjanb
It's that (all - most) of the time that I'm worried about. :)
Bill the Lizard
+6  A: 

Did a quick benchmark. Exceptions aren't actually that expensive, unless you start popping back multiple methods and the JVM has to do a lot of work to get the execution stack in place. When staying in the same method, they aren't bad performers.

public void RunTests()
{
    String str = "1234567890";

    long startTime = System.currentTimeMillis();
    for(int i = 0; i < 100000; i++)
        IsInt_ByException(str);
    long endTime = System.currentTimeMillis();
    System.out.print("ByException: ");
    System.out.println(endTime - startTime);

    startTime = System.currentTimeMillis();
    for(int i = 0; i < 100000; i++)
        IsInt_ByRegex(str);
    endTime = System.currentTimeMillis();
    System.out.print("ByRegex: ");
    System.out.println(endTime - startTime);

    startTime = System.currentTimeMillis();
    for(int i = 0; i < 100000; i++)
        IsInt_ByJonas(str);
    endTime = System.currentTimeMillis();
    System.out.print("ByJonas: ");
    System.out.println(endTime - startTime);
}

private boolean IsInt_ByException(String str)
{
    try
    {
        Integer.parseInt(str);
        return true;
    }
    catch(NumberFormatException nfe)
    {
        return false;
    }
}

private boolean IsInt_ByRegex(String str)
{
    return str.matches("^-?\\d+$");
}

public boolean IsInt_ByJonas(String str)
{
    if (str == null) {
            return false;
    }
    int length = str.length();
    if (length == 0) {
            return false;
    }
    int i = 0;
    if (str.charAt(0) == '-') {
            if (length == 1) {
                    return false;
            }
            i = 1;
    }
    for (; i < length; i++) {
            char c = str.charAt(i);
            if (c <= '/' || c >= ':') {
                    return false;
            }
    }
    return true;
}

Output:

ByException: 31

ByRegex: 453

ByJonas: 16

I do agree that Jonas K's solution is the most robust too. Looks like he wins :)

rally25rs
Great idea to benchmark all three. To be fair to the Regex and Jonas methods, you should test with non-integer strings, since that's where the Integer.parseInt method is going to really slow down.
Bill the Lizard
+1  A: 

You can also use the Scanner class, and use hasNextInt() - and this allows you to test for other types, too, like floats, etc.

Matthew Schinckel
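A minimal sketch of the Scanner approach, assuming you want the whole string to be a single int. ScannerCheck and isInteger are hypothetical names, and fixing the locale to Locale.ROOT is my own assumption to keep the behaviour predictable.

```java
import java.util.Locale;
import java.util.Scanner;

final class ScannerCheck {
    static boolean isInteger(String input) {
        if (input == null) {
            return false;
        }
        try (Scanner sc = new Scanner(input)) {
            sc.useLocale(Locale.ROOT);
            // hasNextInt() only looks at the next token, so consume it
            // and make sure nothing else follows.
            if (!sc.hasNextInt()) {
                return false;
            }
            sc.nextInt();
            return !sc.hasNext();
        }
    }

    public static void main(String[] args) {
        System.out.println(isInteger("123"));
        System.out.println(isInteger("12 abc"));
        System.out.println(isInteger("99999999999"));
    }
}
```

Note that Scanner splits on whitespace, so surrounding spaces are tolerated, and it may also accept locale-specific digit grouping. That is looser than Integer.parseInt().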
This answer gave me a reminder I needed. I completely forgot Scanner had such a function. T-up
Codemonkey
+3  A: 

It partly depends on what you mean by "can be converted to an integer".

If you mean "can be converted into an int in Java" then the answer from Jonas is a good start, but doesn't quite finish the job. It would pass 999999999999999999999999999999 for example. I would add the normal try/catch call from your own question at the end of the method.

The character-by-character checks will efficiently reject "not an integer at all" cases, leaving "it's an integer but Java can't handle it" cases to be caught by the slower exception route. You could do this bit by hand too, but it would be a lot more complicated.

Jon Skeet
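The combination described in the answer above could look roughly like this. It is only a sketch: HybridIntCheck and isInt are made-up names, and the character loop is adapted from Jonas Klemming's answer.

```java
final class HybridIntCheck {
    static boolean isInt(String str) {
        if (str == null || str.isEmpty()) {
            return false;
        }
        int i = 0;
        if (str.charAt(0) == '-') {
            if (str.length() == 1) {
                return false;
            }
            i = 1;
        }
        // Cheap rejection of anything that is not an optional '-' plus ASCII digits.
        for (; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        // Only digit strings reach this point, so the exception path is
        // taken only for values outside the int range.
        try {
            Integer.parseInt(str);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isInt("2147483647")); // true
        System.out.println(isInt("2147483648")); // false
    }
}
```

The trade-off is that valid integers are now scanned twice, once by the loop and once by parseInt().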
A: 
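import java.text.NumberFormat;
import java.text.ParseException;

Number number = null;
try {
    number = NumberFormat.getInstance().parse("123");
} catch (ParseException e) {
    // not a number - do recovery.
    e.printStackTrace();
}
// use number (still null if parsing failed)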
Ran Biron
+4  A: 

Just one comment about regexp. Every example provided here is wrong! If you want to use regexp, don't forget that compiling the pattern takes a lot of time. This:

str.matches("^-?\\d+$")

and also this:

Pattern.matches("-?\\d+", input);

cause the pattern to be compiled on every method call. To use it correctly, do the following:

import java.util.regex.Pattern;

/**
 * @author Rastislav Komara
 */
public class NaturalNumberChecker {
    public static final Pattern PATTERN = Pattern.compile("^\\d+$");

    boolean isNaturalNumber(CharSequence input) {
        return input != null && PATTERN.matcher(input).matches();
    }
}
Rastislav Komara
You can squeeze out a little more performance by creating the Matcher ahead of time, too, and using its reset() method to apply it to the input.
Alan Moore
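A rough sketch of the Matcher reuse suggested in the comment above. ReusedMatcherChecker and isInteger are hypothetical names. Keep in mind that a Matcher holds state and is not safe to share between threads, so this only suits single-threaded use or one instance per thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReusedMatcherChecker {
    // Compiled once; the Matcher is created once and reset for each input.
    private final Matcher matcher = Pattern.compile("-?\\d+").matcher("");

    boolean isInteger(CharSequence input) {
        // reset(CharSequence) points the existing Matcher at the new input.
        return input != null && matcher.reset(input).matches();
    }

    public static void main(String[] args) {
        ReusedMatcherChecker checker = new ReusedMatcherChecker();
        System.out.println(checker.isInteger("-42"));  // true
        System.out.println(checker.isInteger("4x2"));  // false
    }
}
```

Whether the saving is worth the loss of thread safety is something you would have to measure for your own workload.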
+3  A: 

I copied the code from rally25rs's answer and added some tests for non-integer data. The results are undeniably in favor of the method posted by Jonas Klemming. The results for the Exception method that I originally posted are pretty good when you have integer data, but they're the worst when you don't, while the results for the RegEx solution (that I'll bet a lot of people use) were consistently bad.

public void runTests()
{
    String big_int = "1234567890";
    String non_int = "1234XY7890";

    long startTime = System.currentTimeMillis();
    for(int i = 0; i < 100000; i++)
        IsInt_ByException(big_int);
    long endTime = System.currentTimeMillis();
    System.out.print("ByException - integer data: ");
    System.out.println(endTime - startTime);

    startTime = System.currentTimeMillis();
    for(int i = 0; i < 100000; i++)
        IsInt_ByException(non_int);
    endTime = System.currentTimeMillis();
    System.out.print("ByException - non-integer data: ");
    System.out.println(endTime - startTime);

    startTime = System.currentTimeMillis();
    for(int i = 0; i < 100000; i++)
        IsInt_ByRegex(big_int);
    endTime = System.currentTimeMillis();
    System.out.print("\nByRegex - integer data: ");
    System.out.println(endTime - startTime);

    startTime = System.currentTimeMillis();
    for(int i = 0; i < 100000; i++)
        IsInt_ByRegex(non_int);
    endTime = System.currentTimeMillis();
    System.out.print("ByRegex - non-integer data: ");
    System.out.println(endTime - startTime);

    startTime = System.currentTimeMillis();
    for(int i = 0; i < 100000; i++)
        IsInt_ByJonas(big_int);
    endTime = System.currentTimeMillis();
    System.out.print("\nByJonas - integer data: ");
    System.out.println(endTime - startTime);

    startTime = System.currentTimeMillis();
    for(int i = 0; i < 100000; i++)
        IsInt_ByJonas(non_int);
    endTime = System.currentTimeMillis();
    System.out.print("ByJonas - non-integer data: ");
    System.out.println(endTime - startTime);
}

private boolean IsInt_ByException(String str)
{
    try
    {
        Integer.parseInt(str);
        return true;
    }
    catch(NumberFormatException nfe)
    {
        return false;
    }
}

private boolean IsInt_ByRegex(String str)
{
    return str.matches("^-?\\d+$");
}

public boolean IsInt_ByJonas(String str)
{
    if (str == null) {
            return false;
    }
    int length = str.length();
    if (length == 0) {
            return false;
    }
    int i = 0;
    if (str.charAt(0) == '-') {
            if (length == 1) {
                    return false;
            }
            i = 1;
    }
    for (; i < length; i++) {
            char c = str.charAt(i);
            if (c <= '/' || c >= ':') {
                    return false;
            }
    }
    return true;
}

Results:

ByException - integer data: 47
ByException - non-integer data: 547

ByRegex - integer data: 390
ByRegex - non-integer data: 313

ByJonas - integer data: 0
ByJonas - non-integer data: 16
Bill the Lizard
Thanks for picking up my slack! :)
rally25rs
Thanks for writing the original benchmark. I never would have guessed that regex was that bad, or that throwing exceptions was that much worse.
Bill the Lizard
+1  A: 
org.apache.commons.lang.StringUtils.isNumeric

though Java's standard lib really misses such utility functions

I think that Apache Commons is a "must have" for every Java programmer

too bad it isn't ported to Java5 yet

lbownik
The only problem with this is overflow :S I still give you +1 for mentioning commons-lang :)
javamonkey79