Possible Duplicate:
Constant abuse?

I've seen -1 used in various APIs, most commonly when searching into a "collection" with zero-based indices, usually to indicate the "not found" index. This "works" because -1 is never a legal index to begin with. It seems that any negative number should work, but I think -1 is almost always used, as some sort of (unwritten?) convention.

I would like to limit the scope to Java at least for now. My questions are:

  • What are the official words from Sun regarding using -1 as a "special" return value like this?
  • What quotes are there regarding this issue, from e.g. James Gosling, Josh Bloch, or even other authoritative figures outside of Java?
  • What were some of the notable discussions regarding this issue in the past?
+3  A: 

Both Java and JavaScript use -1 when an index isn't found. Since the index is always 0-n it seems a pretty obvious choice.

//JavaScript
var url = 'example.com/foo?bar&admin=true';
if(url.indexOf('&admin') != -1){
  alert('we likely have an insecure app!');
}

I find this approach (which I've used when extending Array-type elements to have a .indexOf() method) to be quite normal.

On the other hand, you can try the PHP approach e.g. strpos() but IMHO it gets confusing as there are multiple return types (it returns FALSE when not found)

scunliffe
Also true in .NET.
Richard
Yes, it's extremely common. Still not sure that actually makes it good. :)
back2dos
Being common, thus the "status quo", makes it familiar. I'd rather go with what other developers are "expecting" than try and re-invent the wheel. e.g. I expect indexes to be zero-based, index-not-found == -1, a toString() method implementation on every object, etc.
scunliffe
+1  A: 

It is good practice to define a final class variable for all constant values in your code. But it is generally accepted to use 0, 1, -1, "" (empty string) without an explicit declaration.

PeterMmm
+1  A: 

This is inherited from C, where only a single primitive value could be returned. In Java you can also return a single object.

So for new code, return an object of a base type, with the subtype indicating the problem (to be checked with instanceof), or throw a "not found" exception.
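
A minimal sketch of the "base type plus subtype" idea, assuming hypothetical names (SearchResult, Found, NotFound and find are made up for illustration and are not JDK types):

```java
final class SearchDemo {
    // Hypothetical base type; the concrete subtype tells the caller what happened.
    static abstract class SearchResult {}

    static final class Found extends SearchResult {
        final int index;
        Found(int index) { this.index = index; }
    }

    static final class NotFound extends SearchResult {}

    /** Wraps String.indexOf so that "absent" becomes a type, not a magic int. */
    static SearchResult find(String s, char c) {
        int i = s.indexOf(c); // real JDK call; returns -1 when c is absent
        return (i >= 0) ? new Found(i) : new NotFound();
    }

    public static void main(String[] args) {
        SearchResult r = find("hello", 'l');
        if (r instanceof Found) {
            System.out.println("found at " + ((Found) r).index);
        } else {
            System.out.println("not found");
        }
    }
}
```

The caller can no longer use the index without first checking which case it got, at the cost of one allocation per call.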

For existing special values, make -1 a constant in your code and name it accordingly - NOT_FOUND - so the reader can tell the meaning without having to check the javadocs.
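
For illustration, a small sketch of the named-constant approach. The class Lookup and its indexOf helper are assumptions made up for this example; only the NOT_FOUND name comes from the answer above.

```java
final class Lookup {
    /** Named sentinel, so call sites read as intent rather than as a bare -1. */
    static final int NOT_FOUND = -1;

    /** Linear search: returns the first index of target, or NOT_FOUND. */
    static int indexOf(int[] values, int target) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                return i;
            }
        }
        return NOT_FOUND;
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15};
        if (indexOf(data, 16) == NOT_FOUND) {
            System.out.println("16 is absent");
        }
    }
}
```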

Thorbjørn Ravn Andersen
+1  A: 

The same practice as with null applies to -1. It's been discussed many times.

e.g. Java api design - NULL or Exception

takoi
+1  A: 

It's used because it's the first invalid value you encounter below the start of a 0-based array. As you know, not all types can hold null or nothing, so you need "something" to signify nothing.

I would say it's not official; it has just become an (unwritten) convention because it's very sensible for the situation. Personally, I wouldn't call it an issue either. API design is also down to the author, but guidelines can be found online.

Adam
+9  A: 

This is a common idiom in languages where the types do not include range checks. An "out of bounds" value is used to indicate one of several conditions. Here, the return value indicates two things: 1) was the character found, and 2) where was it found. The use of -1 for not found and a non-negative index for found succinctly encodes both of these into one value, and reflects the fact that the not-found case has no index to return.

In a language with strict range checking, such as Ada or Pascal, the method might be implemented as (pseudo code)

   bool indexOf(c:char, position:out Positive);

Positive is meant here as a subtype of int restricted to non-negative values. (In Ada itself that subtype is called Natural; Ada's Positive starts at 1.)

This separates the found/not-found flag from the position. The position is provided as an out parameter - essentially another return value. It could also be an in-out parameter, to start the search from a given position. Use of -1 to indicate not-found would not be allowed here since it violates range checks on the Positive type.

The alternatives in Java are:

  • throw an exception: this is not a good choice here, since not finding a character is not an exceptional condition.
  • split the result into several methods, e.g. boolean indexOf(char c); int lastFoundIndex();. This implies the object must hold on to state, which will not work in a concurrent program, unless the state is stored in thread-local storage, or synchronization is used - all considerable overheads.
  • return the position and found flag separately: such as boolean indexOf(char c, Position pos). Here, creating the position object may be seen as unnecessary overhead.
  • create a multi-value return type

such as

class FindIndex {
   boolean found;
   int position;
}

FindIndex indexOf(char c);

Although it clearly separates the return values, it suffers object creation overhead. Some of that could be mitigated by passing the FindIndex as a parameter, e.g.

FindIndex indexOf(char c, FindIndex start);
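
A runnable sketch of both signatures, under stated assumptions: the search is done over a String passed in explicitly, the FindIndex fields are mutable so the object can be reused, and the reuse variant starts searching at the position already stored in the object. None of this is an existing JDK API.

```java
final class FindIndexDemo {
    /** Hypothetical compound result: a found flag plus a position. */
    static final class FindIndex {
        boolean found;
        int position; // on input: where to start; on output: where c was found
    }

    /** Allocating variant: one new FindIndex per call. */
    static FindIndex indexOf(String s, char c) {
        return indexOf(s, c, new FindIndex());
    }

    /** Reusing variant: searches from reuse.position and writes the result back. */
    static FindIndex indexOf(String s, char c, FindIndex reuse) {
        int i = s.indexOf(c, reuse.position); // String.indexOf(int ch, int fromIndex)
        reuse.found = i >= 0;
        if (reuse.found) {
            reuse.position = i;
        }
        return reuse;
    }

    public static void main(String[] args) {
        FindIndex r = new FindIndex();
        // Walk over every 'c' in the string with a single result object.
        while (indexOf("abcabc", 'c', r).found) {
            System.out.println("c at " + r.position);
            r.position++; // continue after the match
        }
    }
}
```

The loop allocates only once, which is the mitigation described above, but the mutable result object is then no safer to share between threads than any other mutable state.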

Incidentally, multiple return values were going to be part of Java (Oak), but were axed prior to 1.0 to cut time to release. James Gosling says he wishes they had been included. It's still a wished-for feature.

My take is that the use of magic values is a practical way of encoding a multi-valued result (a flag and a value) in a single return value, without requiring excessive object creation overhead.

However, if using magic values, it's much nicer to work with if they are consistent across related api calls. For example,

   // get everything after the first c
   int index = str.indexOf('c');
   String afterC = str.substring(index);

Java falls short here, since passing -1 to substring causes an IndexOutOfBoundsException. Instead, it might have been more consistent for substring to return "" when invoked with -1, if negative values are considered to start at the end of the string. Critics of magic values for error conditions say that the return value can be ignored (or assumed to be positive). A consistent API that handles these magic values in a useful way would reduce the need to check for -1 and allow for cleaner code.
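
As things stand, the caller has to guard the index explicitly. A minimal sketch, assuming a hypothetical helper fromFirst that returns "" in the not-found case (the behaviour suggested above, not something the JDK provides):

```java
final class AfterChar {
    /** Returns the text from the first c onward, or "" if c does not occur. */
    static String fromFirst(String str, char c) {
        int index = str.indexOf(c);
        // Without this check, substring(-1) would throw.
        return (index >= 0) ? str.substring(index) : "";
    }

    public static void main(String[] args) {
        System.out.println(fromFirst("abcd", 'c'));
        try {
            "abd".substring("abd".indexOf('c')); // indexOf returns -1 here
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded call threw " + e.getClass().getSimpleName());
        }
    }
}
```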

mdma
+1 for the compound return value approach. very clean, although it adds an overhead in performance and code. also it is a little cumbersome, if you actually know the value is in the collection.
back2dos
Why would `indexOf()` accept `FindIndex` as an *argument*? What does it mean to indicate that the starting index is "not found"?
seh
Also, prior art for something like `FindIndex`: Haskell's `Maybe` type, Scala's `Option` type, and any other models for monadic zero in the "maybe/optional" monad. Even a pointer or a potentially null reference to an integer counts.
seh
For the api to provide least surprise, I imagine it would search from the index given in the FindIndex, ignoring the "found" flag.
mdma
indexOf() is not accepting it as an argument, it's taking it as a place to put its return values.
Jeanne Pindar
+1  A: 

As far as I know, such values are called sentinel values, although most common definitions differ slightly from this scenario.

Languages such as Java chose not to support passing by reference (which I think is a good idea), so while the objects that arguments refer to may be mutable, the variables passed to a function remain unaffected. As a consequence, you can only have one return value of only one type. So what you do is choose an otherwise invalid value of a valid type and return it to transport additional semantics, because the return value is then not actually the result of the operation but a special signal.

Now I guess, the cleanest approach would be to have a contains and an indexOf method, the second of which would throw an exception, if the element you're asking for is not in the collection. Why? Because one would expect the following to be true:

someCollection.objectAtIndex(someCollection.indexOf(someObject)) == someObject

What you're likely to get is an exception because -1 is out of bounds, while the actual reason this plausible relation does not hold is that someObject is not an element of someCollection, and that is why the inner call should raise the exception.

As clean and robust as this may be, it has two key flaws:
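
Roughly what that could look like on top of java.util.List, as a sketch only: indexOfOrThrow is a made-up helper name, and the choice of NoSuchElementException is my assumption, not something the collections API prescribes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

final class StrictIndex {
    /** Like List.indexOf, but a missing element raises instead of returning -1. */
    static <T> int indexOfOrThrow(List<T> list, T item) {
        int i = list.indexOf(item); // List.indexOf returns -1 when absent
        if (i < 0) {
            throw new NoSuchElementException("not in collection: " + item);
        }
        return i;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob");
        // The round trip now either holds or fails at the point of the real problem.
        System.out.println(names.get(indexOfOrThrow(names, "bob")));
        if (names.contains("eve")) { // separate membership test, a second O(n) pass
            System.out.println(indexOfOrThrow(names, "eve"));
        }
    }
}
```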

  • Both operations would usually cost you O(n) (unless you have an inverse map within the collection), so you're better off if you do just one.
  • It is really quite verbose.

In the end, it's up to you to decide. This is a matter of philosophy. I'd call it a "semantic hack" to achieve both shortness & speed at the cost of robustness. Your call ;)

greetz
back2dos

back2dos
+1  A: 

It's like why 51% means everything among the shareholders of a company: it's the nearest value that does the job, and it makes more sense than -2 or -3 ...

Xaqron
+3  A: 

Is -1 a magic number?

In this context, not really. There is nothing special about -1 ... apart from the fact that it is guaranteed to be an invalid index value by virtue of being negative.

An anti-pattern?

No. To qualify as an anti-pattern there would need to be something harmful about this idiom. I see nothing harmful in using -1 this way.

A code smell?

Ditto. (It is arguably better style to use a named constant rather than a bare -1 literal. But I don't think that is what you are asking about, and it wouldn't count as "code smell" anyway, IMO.)

Quotes and guidelines from authorities

Not that I'm aware of. However, I would observe that this "device" is used in various standard classes. For example, String.indexOf(...) returns -1 to say that the character or substring could not be found.


As far as I am concerned, this is simply an "algorithmic device" that is useful in some cases. I'm sure that if you looked back through the literature, you would see examples of using -1 (or 0 for languages with one-based arrays) this way going back to the 1960's and before.

The choice of -1 rather than some other negative number is simply a matter of personal taste, and (IMO) not worth analyzing in this context.


It may be a bad idea for a method to return -1 (or some other value) to indicate an error instead of throwing an exception. However, the problem here is not the value returned but the fact that the method is requiring the caller to explicitly test for errors.

The flip side is that if the "condition" represented by -1 (or whatever) is not an "error" / "exceptional condition", then returning the special value is both reasonable and proper.

Stephen C
+1  A: 

-1 as a return value is slightly ugly but necessary. The alternatives to signal a "not found" condition are IMHO all much worse:

  • You could throw an Exception, but this isn't ideal because Exceptions are best used to signal unexpected conditions that require some form of recovery or propagated failure. Not finding an occurrence of a substring is actually pretty expected. Also Exception throwing has a significant performance penalty.

  • You could use a compound result object with (found,index) but this requires an object allocation and more complex code on the part of the caller to inspect the result.

  • You could separate out two separate function calls for contains and indexOf - however this is again quite cumbersome for the caller and also results in a performance hit as both calls would be O(n) and require a full traversal of the String.
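
For comparison, the compound-result alternative can be approximated in Java 8 and later with java.util.OptionalInt, which is a real JDK class. The wrapper below is only a sketch; OptionalIndex and its indexOf method are hypothetical names.

```java
import java.util.OptionalInt;

final class OptionalIndex {
    /** Returns the index of sub in s, or an empty OptionalInt if it is absent. */
    static OptionalInt indexOf(String s, String sub) {
        int i = s.indexOf(sub);
        return (i >= 0) ? OptionalInt.of(i) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        OptionalInt hit = indexOf("haystack", "st");
        if (hit.isPresent()) {
            System.out.println("found at " + hit.getAsInt());
        }
        // The not-found case cannot be mistaken for an index.
        System.out.println(indexOf("haystack", "needle").isPresent());
    }
}
```

It still carries the allocation and the extra caller code mentioned in the list above, so it illustrates the trade-off rather than removing it.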

Personally, I never like to refer to the -1 constant: my test for not-found is always something like:

int i = someString.indexOf("substring");
if (i>=0) {
  // do stuff with found index
} else {
  // handle not found case
}
mikera