views:

191

answers:

7

I think regex is pretty fast and the third option is confusing. What do you think?

http://jqfundamentals.com/book/ch09s12.html

// old way
if (type == 'foo' || type == 'bar') { ... }

// better
if (/^(foo|bar)$/.test(type)) { ... }

// object literal lookup 
if (({ foo : 1, bar : 1 })[type]) { ... }
+2  A: 
  1. The first option involves potentially two string compares.
  2. The second option involves a parse each time.
  3. The third option does a simple hash of the string and then a hash table look up, which is the most efficient in this case, in terms of the amount of work that needs to be done.

The third option also scales better than the other two as more alternative strings are added, because the first two are O(n) and the third is O(1) in the average case.

If we want to talk about which option is prettier / more maintainable, that's a whole separate conversation.

Michael Goldshteyn
The third option is O(n) also: you have to create hash first. (unless you want cache it manually, of course)
Nikita Rybak
Of course, it also involves a completely unnecessary memory allocation and still requires the string comparisons for the true case (the hashes *can* be sufficient for the false case, but not the true). It's also difficult for maintainers to read and any performance benefit almost certainly implementation-dependent and not worth the trouble in any but edge cases.
T.J. Crowder
Yes, but presumably the hash is created once and then reused with a good javascript interpreter, because the values are constant. I do agree with T.J. Crowder on it being more difficult to read, though.
Michael Goldshteyn
@Michael: So, not the one in the most popular browser on the planet, then.
T.J. Crowder
@TJ, There's always IE9 :) .
Michael Goldshteyn
@Michael: We're all hoping, yeah. :-)
T.J. Crowder
A: 

The object literal lookup is optimized with a hash lookup which only requires one logical check instead of n. In a longer list you also won't have to repeat "type == " a zillion times.

Peter DeWeese
but you have to allocate and create the object
olliej
Yes, but with a well implemented JS interpreter it doesn't have to be created multiple times. I'm not saying that it is necessarily implemented well.
Peter DeWeese
You can't not allocate the object every time, as the object can escape. For example:
olliej
erk, hitting enter in comments submits why?
olliej
Object.prototype.__defineGetter__("wibble", function(){this.x = old.x+1; old=this;}); then if ({....}[someUnknownValue]) { ... } can escape and you have to create a new object. A lot of the slowness in JS is caused by the _possibility_ of someone doing something stupid.
olliej
+6  A: 

I'll humbly disagree with Rebecca Murphey and vote for simplicity, for the first option.

I think regex is pretty fast
Machine code is even faster, but we don't use it.

the third option is confusing
It's only confusing if you're unfamiliar with the trick. (And for people not used to seeing regex to compare two strings, second option will be even more confusing.)

Nikita Rybak
A: 

For simplicity and readability the first will win every time. It might not be as fast, but who cares unless it is in a heavily run loop.

Good compilers should optimize things like this away.

Patrick
what are they optimising away? This is javascript is surprisingly difficult to optimise things out.
olliej
When you are comparing two items you could argue the readability is better in the first one but if you have 4 or more it is going to start to fall onto more than one line and readability becomes an issue.
Toby
@olliej: Lots of things are difficult to optimize in dynamic languages like JavaScript, but not `if (type == 'foo' || type == 'bar')` :-)
T.J. Crowder
@T. J. Crowder: what are you going to optimise out? You have to do the checks (note i am assuming that strings are interned so you can typically get away with a pointer compare)
olliej
I took a graduate level CS course where I optimized the heck out of a C++ program. My optimized program faired worse than what the compiler could do with "standard" code. Everyone in the class was the same way. Algorithm design is far more important then agonising over a single line or two. C++ is not Javascript, but the more vanilla your code is the easier for machine and human to work with.
Patrick
+1  A: 

The first case should really be done with === to avoid any type coercions, but depending on the number of alternatives you need to check it can become O(N), however depending on your code most JS engines will be able to a simple pointer check for the comparison.

In the second case you use a RegExp, and while RegExps are very fast, they tend to be slower for simple equality decisions than more direct equality comparisons. Simple string comparisons like yours are likely to be a pointer compare in a modern JS engine, but if you use a regexp the regexp must read every character.

The third case is more tricky -- if you do have a lot of values to check it may be faster, especially if you cache the object rather than repeatedly recreating it as it will simply be a hash lookup -- the exact performance of the lookup depends on the engine though.

I suspect a switch statement would beat the object literal case though.

Out of curiosity I made a test (which you can see here), the fastest approach (in a webkit nightly at least) seems to be a switch statement, followed by if, followed by the object, with regexp's last.

olliej
+1, I honestly don't understand how she can call case 2 or 3 "best practices" and I really hope we won't start seeing everyone doing simple comparisons with `regexp`
David Titarenco
+5  A: 

I just made a rudimentary benchmark and I'm honestly not sure how she got those results... http://jsbin.com/uzuxi4/2/edit

Regex seems to scale the best, but the first is by far the fastest on all modern browsers. The last is excruciatingly slow. I understand the complexity theory between the three, but in practice, it doesn't seem that she's correct.

Let alone the fact that the first also has the best readability, it also seems to be the fastest. I even nested loops to take advantage of any browser caching of literal tables or constants (to no avail).


Edit: It appears that when an object is explicitly created, she is indeed correct, however: http://jsbin.com/uzuxi4/4/edit

function __hash() {
  ...

  var type = 'bar';
  var testobj = { foo : 1, bar : 1 };
  var c = 0;
  for (i = 0; i < 1000; i++) {
    if (testobj[type]) {
      for (j = 0; j < 10000; j++) {
          if (testobj[type]) { c++; }
      }
    }
  }

  ...
}

We see that once the object has an internal reference, the seek time drops to about 500 ms which is probably the plateau. Object key lookup may be the best for larger data-sets, but in practice I don't really see it as a viable option for every-day use.

David Titarenco
She's ignoring the cost of what she's doing -- a [] access to an object is not cached in any modern engine, and of course you have to allocate that object (and then pay gc cost later).
olliej
My brief testing showed switch to be the fastest :D
olliej
Yes, but of course! Switch ftw :)
David Titarenco
A: 

Just wanted to weigh in here and remind everyone that this is an open-source book with contributions from many people! The section being discussed, indeed, is based on content provided by a community member. If you have suggestions for improving the section, by all means, please open an issue on the repository, or better, fork the repo and send me a pull request :)

That said, I have just set up a jsPerf test (http://jsperf.com/string-tests), and at least in Chrome, the results are the opposite of what the book says. I've opened an issue on the book, and will try to deal with this in the near future.

Finally, two things:

  • I want to echo what another commenter said: perf optimizations are fun to talk about, and while there are some that really do matter, many don't. It's important to keep perspective on how much -- or little -- of a difference stuff like this makes.
  • I also want to echo the commenter who said, essentially, that readability is in the eyes of the beholder. Something confusing to one person may be perfectly clear to another. I do believe we should strive for readability, but I think there's a happy medium. Reading code that was a bit perplexing to me at first opened my eyes to a lot of great techniques; I'd have hated if it had been written so the complete newb that I was at the time could understand it.
rmurphey