The JavaScript String object has two substring functions: substring and substr.

  • substring takes two parameters beginIndex and endIndex.

  • substr also takes two parameters beginIndex and length.
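As a quick illustration, the two calls below pull the same range out of a sample string (the string is made up for the example):

```javascript
const s = "abcdefgh";

// substring(beginIndex, endIndex): endIndex is exclusive
const byIndexes = s.substring(2, 5);

// substr(beginIndex, length)
const byLength = s.substr(2, 3);

console.log(byIndexes); // "cde"
console.log(byLength);  // "cde"
```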

It's trivial to convert any range between the two variants, but I wonder if there's any significance to how the two would normally be used (in day-to-day programming). I tend to favor the index/length variant, but I have no good explanation as to why.

I guess it depends on what kind of programming you do, but if you have a strong opinion on the matter, I'd like to hear it.

When is an (absolute, relative) range more suited than an (absolute, absolute) one, and vice versa?

Update:

This is not a JavaScript question per se (JavaScript just happens to implement both variants [which I think is stupid]), but what practical implications does the relative vs. absolute range have? I'm looking for a solid argument for why we prefer one over the other. To broaden the debate a bit, how would you prefer to design your data structures for use with either approach?

A: 

I slightly prefer the startIndex, endIndex variant, since then to get the last bit of a string I can do:

string foo = bar.substring(5, bar.length());

instead of:

string foo = bar.substring(5, bar.length() - 5);
Dominic Rodger
`length` should be a property not a function
RaYell
Not in C++ it's not, and what bearing does that have on the question?
Dominic Rodger
+1  A: 

When is an (absolute, relative) range more suited than an (absolute, absolute) one, and vice versa?

The former when you know how much; the latter when you know where.

I presume substring is implemented in terms of substr:

substring( b, e ) {
  return substr( b, e - b );
}

or substr in terms of substring:

substr( b, l ) {
  return substring( b, b + l );
}
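
A minimal JavaScript sketch of the same two conversions, written as free functions. The names substringViaSubstr and substrViaSubstring are hypothetical, and the equivalence is only assumed for in-range arguments (0 <= b <= e <= s.length); the built-in methods treat negative or swapped arguments differently from each other.

```javascript
// Hypothetical helpers; only meant for 0 <= b <= e <= s.length.
function substringViaSubstr(s, b, e) {
  // (start, end) expressed as (start, length)
  return s.substr(b, e - b);
}

function substrViaSubstring(s, b, l) {
  // (start, length) expressed as (start, end)
  return s.substring(b, b + l);
}

const text = "hello world";
console.log(substringViaSubstr(text, 6, 11)); // "world"
console.log(substrViaSubstring(text, 6, 5));  // "world"
```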
tpdi
+2  A: 

I prefer the startIndex, endIndex variant (substring) because String.substring() operates the same way in Java and I feel it makes me more efficient to stick to the same concepts in whatever language I use most often (when possible).

If I were doing more C# work, I might use the other variant more because that is how String.Substring() works in C#.

To answer your comment about JavaScript having both, it looks like substr() was added to browsers after substring() (reference - it seems that although substr() was part of JavaScript 1.0, most browser vendors didn't implement it until later). This suggests to me that even the implementers of the early language recognized the duplication of functionality. I'd suggest substring() came first in an attempt to leverage the JavaScript trademark. Regardless, it seems that they recognized this duplication in ECMA-262 and took some small steps toward removing it:

  • substring(): ECMA Version: ECMA-262
  • substr(): ECMA Version: None, although ECMA-262 ed. 3 has a non-normative section suggesting uniform semantics for substr

Personally I wouldn't mind a substring() where the second parameter can be negative, which would return the characters between the first parameter and the length of the string minus the second parameter. Of course you can already achieve that more explicitly and I imagine the design would be confusing to many developers:

String s1 = "The quick brown fox jumps over the lazy dog";
String s2 = s1.substring(20, -13); // "jumps over"
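
For what it's worth, JavaScript's String.prototype.slice already treats a negative end index this way (it counts back from the end of the string), so the idea can be tried directly. A small sketch using the same sentence:

```javascript
const s1 = "The quick brown fox jumps over the lazy dog";

// A negative end index is taken relative to the end of the string:
// -13 here means s1.length - 13.
const s2 = s1.slice(20, -13);

// The explicit equivalent with substring
const s3 = s1.substring(20, s1.length - 13);

console.log(s2);        // "jumps over"
console.log(s2 === s3); // true
```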
Grant Wagner
I like the idea that the operation cannot throw an argument-out-of-range exception or similar error.
John Leidegren
A: 

It depends on the case, but I more often find I know exactly how many characters I want to take out, and so prefer the start-with-length parameterization. But I could easily see a case where I've searched a long string for two tokens and now have their indexes. While it's trivial math to use either variant, in that case I might prefer the start and end indexes.
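
As a sketch of the two-token case, assume a made-up string with a value between square brackets. Both parameterizations work, but the index pair maps almost directly onto the indexOf results:

```javascript
const line = "name=[Ada Lovelace];";

const open = line.indexOf("[");  // index of the first token
const close = line.indexOf("]"); // index of the second token

// (start, end): the indexes are used almost as-is
const byIndexes = line.substring(open + 1, close);

// (start, length): the length has to be derived first
const byLength = line.substr(open + 1, close - open - 1);

console.log(byIndexes); // "Ada Lovelace"
console.log(byLength);  // "Ada Lovelace"
```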

Also, from a document writer's perspective, having two parameters of the same basic meaning is probably easier to write about and an easier mnemonic.

Each of these functions recovers gracefully when given strange values, such as an end smaller than the start, a negative length, a negative start, or a length or end beyond the string's end.
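
A few of these recoveries, as the JavaScript methods behave in current engines (older browsers may differ, particularly for a negative start in substr):

```javascript
const s = "abcdef";

// substring(start, end)
const swapped = s.substring(4, 1);      // end < start: the arguments are swapped
const negStart = s.substring(-2, 3);    // negative start is treated as 0
const farEnd = s.substring(2, 100);     // end is clamped to the string length

// substr(start, length)
const fromEnd = s.substr(-3, 2);        // negative start counts from the end
const negLength = s.substr(2, -1);      // negative length yields an empty string
const longLength = s.substr(2, 100);    // length is clamped to what remains

console.log([swapped, negStart, farEnd]);       // [ 'bcd', 'abc', 'cdef' ]
console.log([fromEnd, negLength, longLength]);  // [ 'de', '', 'cdef' ]
```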

For JavaScript the best practice is to use substring over substr because it's supported in more (albeit usually older) browsers. If they'd gone with BasicScript instead would there have been a MID() and a MIDDLE() function? Who doesn't love BASIC syntax?

dlamblin