views:

2551

answers:

14

Seeing as Java doesn't have nullable types, nor does it have a TryParse(), how do you handle input validation without throwing an exceptions?

The usual way:

String userdata = /*value from gui*/
int val;
try
{
   val = Integer.parseInt(userdata);
}
catch (NumberFormatException nfe)
{
   // bad data - set to sentinel
   val = Integer.MIN_VALUE;
}

I could use a regex to check if it's parseable, but that seems like a lot of overhead as well.

What's the best practice for handling this situation?

EDIT: Rationale: There's been a lot of talk on SO about exception handling, and the general attitude is that exceptions should be used for unexcpected scenarios only. However, I think bad user input is EXPECTED, not rare. Yes, it really is an academic point.

A: 

Put some if statements in front of it. if (null != userdata )

arinte
Won't work. What if the user types in "Eleventy" or 6.2
chris
+10  A: 

That's pretty much it, although returning MIN_VALUE is kind of questionable, unless you're sure it's the right thing to use for what you're essentially using as an error code. At the very least I'd document the error code behavior, though.

Might also be useful (depending on the application) to log the bad input so you can trace.

Steve B.
Well, the example was contrived. I could also set a flag in the catch block -- userInputIsGood = false;
chris
Unless your application is all about parsing user-input integers, and those users are really bad at entering integers, I wouldn't expect the exception overhead to be significant.
erickson
+1  A: 

I think the best practice is the code you show.

I wouldn't go for the regex alternative because of the overhead.

Shimi Bandiel
+8  A: 

What's the problem with your approach? I don't think doing it that way will hurt your application's performance at all. That's the correct way to do it. Don't optimize prematurely.

asterite
+1  A: 

The best practice would be to catch the Exception and display an error message to the user. Never set something to some magic number if the input is bad.

Paul Croarkin
+5  A: 

I'm sure it is bad form, but I have a set of static methods on a Utilities class that do things like Utilities.tryParseInt(String value) which returns 0 if the String is unparseable and Utilities.tryParseInt(String value, int defaultValue) which allows you to specify a value to use if parseInt() throws an exception.

I believe there are times when returning a known value on bad input is perfectly acceptable. A very contrived example: you ask the user for a date in the format YYYYMMDD and they give you bad input. It may be perfectly acceptable to do something like Utilities.tryParseInt(date, 19000101) or Utilities.tryParseInt(date, 29991231); depending on the program requirements.

Grant Wagner
I have the same utility method but instead return Integer and use null for invalid integers. At some point, some one up the call chain will have to deal with an invalid integer.
Steve Kuo
A: 

You could use a Integer, which can be set to null if you have a bad value. If you are using java 1.6, it will provide auto boxing/unboxing for you.

Milhous
A: 

The above code is bad because it is equivalent as the following.

// this is bad
int val = Integer.MIN_VALUE;
try
{
   val = Integer.parseInt(userdata);
}
catch (NumberFormatException ignoreException) { }

The exception is ignored completely. Also, the magic token is bad because an user can pass in -2147483648 (Integer.MIN_VALUE).

The generic parse-able question is not beneficial. Rather, it should be relevant to the context. Your application has a specific requirement. You can define your method as

private boolean isUserValueAcceptable(String userData)
{
   return (    isNumber(userData)    
          &&   isInteger(userData)   
          &&   isBetween(userData, Integer.MIN_VALUE, Integer.MAX_VALUE ) 
          );
}

Where you can documentation the requirement and you can create well defined and testable rules.

mjlee
You say "The above code is bad." Why exactly is it better to assign a default value and ignore the exception?
Michael Myers
Oh, I see. You were saying that both were bad.
Michael Myers
I'd love to see how that isBetween() method is implemented without parsing the string... So you wind up parsing the string twice - in validation scenarios, this is quite normal, though. Don't know why anyone modded you down, I'm modding it back up - you are straight on with this.
Kevin Day
+9  A: 

For user supplied data, Integer.parseInt is usually the wrong method because it doesn't support internationisation. The java.text package is your (verbose) friend.

try {
    NumberFormat format = NumberFormat.getIntegerInstance(locale);
    format.setParseIntegerOnly(true);
    format.setMaximumIntegerDigits(9);
    ParsePosition pos = new ParsePosition(0);
    int val = format.parse(str, pos).intValue();
    if (pos.getIndex() != str.length()) {
        // ... handle case of extraneous characters after digits ...
    }
    // ... use val ...
} catch (java.text.ParseFormatException exc) {
    // ... handle this case appropriately ...
}
Tom Hawtin - tackline
A formatter is a good idea for parsing raw user input. The problem lies in the implementation. DecimalFormat can incompletely parse input, like "123.99 + one dozen". Did the user mean "123"? "124"? "136"? The only safe thing is to reject this, but DecimalFormat will silently return 123.
erickson
Yes. Fixed. Thanks.
Tom Hawtin - tackline
+1  A: 

Here's how I do it:

public Integer parseInt(String data) {
  Integer val = null;
  try {
    val = Integer.parseInt(userdata);
  } catch (NumberFormatException nfe) { }
  return val;
}

Then the null signals invalid data. If you want a default value, you could change it to:

public Integer parseInt(String data,int default) {
  Integer val = default;
  try {
    val = Integer.parseInt(userdata);
  } catch (NumberFormatException nfe) { }
  return val;
}
noah
Slight optimization: set the default value in the catch block
Steve Kuo
Steve Kuo - Nah - that's not really an optimization - that's just coding style. javac will generate the same byte code regardless of what you do with that one.
Kevin Day
A: 

I pretty much follow that approach exception I will try to set the value to a default (if available) and log the exception as a WARN.

Javamann
A: 

If you can avoid exceptions by testing beforehand like you said (isParsable()) it might be better--but not all libraries were designed with that in mind.

I used your trick and it sucks because stack traces on my embedded system are printed regardless of if you catch them or not :(

Bill K
A: 

The exception mechanism is valuable, as it is the only way to get a status indicator in combination with a response value. Furthermore, the status indicator is standardized. If there is an error you get an exception. That way you don't have to think of an error indicator yourself. The controversy is not so much with exceptions, but with Checked Exceptions (e.g. the ones you have to catch or declare).

Personally I feel you picked one of the examples where exceptions are really valuable. It is a common problem the user enters the wrong value, and typically you will need to get back to the user for the correct value. You normally don't revert to the default value if you ask the user; that gives the user the impression his input matters.

If you do not want to deal with the exception, just wrap it in a RuntimeException (or derived class) and it will allow you to ignore the exception in your code (and kill your application when it occurs; that's fine too sometimes).

Some examples on how I would handle NumberFormat exceptions: In web app configuration data:

loadCertainProperty(String propVal) {
  try
  {
    val = Integer.parseInt(userdata);
    return val;
  }
  catch (NumberFormatException nfe)
  { // RuntimeException need not be declared
    throw new RuntimeException("Property certainProperty in your configuration is expected to be " +
                               " an integer, but was '" + propVal + "'. Please correct your " +
                               "configuration and start again");
    // After starting an enterprise application the sysadmin should always check availability
    // and can now correct the property value
  }
}

In a GUI:

public int askValue() {
  // TODO add opt-out button; see Swing docs for standard dialog handling
  boolean valueOk = false;
  while(!valueOk) {
    try {
      String val = dialog("Please enter integer value for FOO");
      val = Integer.parseInt(userdata);
      return val; 
    } catch (NumberFormatException nfe) {
      // Ignoring this; I don't care how many typo's the customer makes
    }
  }
}

In a web form: return the form to the user with a usefull error message and a chance to correct. Most frameworks offer a standardized way of validation.

extraneon
A: 

I'm going to restate the point that stinkyminky was making towards the bottom of the post:

A generally well accepted approach validating user input (or input from config files, etc...) is to use validation prior to actually processing the data. In most cases, this is a good design move, even though it can result in multiple calls to parsing algorithms.

Once you know that you have properly validated the user input, then it is safe to parse it and ignore, log or convert to RuntimeException the NumberFormatException.

Note that this approach requires you to consider your model in two pieces: the business model (Where we actually care about having values in int or float format) and the user interface model (where we really want to allow the user to put in whatever they want).

In order for the data to migrate from the user interface model to the business model, it must pass through a validation step (this can occur on a field by field basis, but most scenarios call for validation on the entire object that is being configured).

If validation fails, then the user is presented with feedback informing them of what they've done wrong and given a chance to fix it.

Binding libraries like JGoodies Binding and JSR 295 make this sort of thing a lot easier to implement than it might sound - and many web frameworks provide constructs that separate user input from the actual business model, only populating business objects after validation is complete.

In terms of validation of configuration files (the other use case presented in some of the comments), it's one thing to specify a default if a particular value isn't specified at all - but if the data is formatted wrong (someone types an 'oh' instead of a 'zero' - or they copied from MS Word and all the back-ticks got a funky unicode character), then some sort of system feedback is needed (even if it's just failing the app by throwing a runtime exception).

Kevin Day
wow - a down mod without justification. How pleasant...
Kevin Day