I have a set of String values in JavaScript, and I need to write a function that detects whether another specific String value belongs to this set. What is the fastest way to achieve this? Is it all right to put the set of values into an array and then write a function that searches through the array? I think if I keep the values sorted and do a binary search, it should work fast enough. Or is there some other smart way of doing this that works faster?

Thanks.

A: 

Using a hash table might be a quicker option.

Whatever option you go for, it's definitely worth testing its performance against the alternatives you are considering.

Chris Kimpton
+4  A: 

You can use an object like so:

// prepare a mock-up object
var setOfValues = {};
for (var i = 0; i < 100; i++) {
  setOfValues["example value " + i] = true;
}

// check for existence
if (setOfValues["example value 99"]) { /* reached: the lookup yields true */ }
if (setOfValues["example value 101"]) { /* not reached: the lookup yields undefined, essentially false */ }

This takes advantage of the fact that objects are implemented as associative arrays. How fast that is depends on your data and the JavaScript engine implementation, but you can do some performance testing easily to compare against other variants of doing it.

If a value can occur more than once in your set and the "how often" is important to you, you can also use an incrementing number in place of the boolean I used for my example.
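
The counting variant described above could be sketched as follows. The helper names addValue and countOf are hypothetical (they are not from the answer), and the own-property check is an added assumption so that inherited keys such as "toString" are not mistaken for stored values.

```javascript
// Hypothetical helpers: store a count per string instead of the boolean.
function addValue(counts, value) {
  // Only treat the key as present if it was stored on the object itself.
  if (Object.prototype.hasOwnProperty.call(counts, value)) {
    counts[value] += 1;
  } else {
    counts[value] = 1;
  }
}

// Returns how often the value was added, or 0 if it was never added.
function countOf(counts, value) {
  return Object.prototype.hasOwnProperty.call(counts, value) ? counts[value] : 0;
}

var counts = {};
addValue(counts, "apple");
addValue(counts, "apple");
addValue(counts, "pear");
```

Here countOf(counts, "apple") gives the number of times "apple" was added, and a value that was never added gives 0.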

Tomalak
This doesn't work. setOfValues[x] where x is not in the set will not evaluate to undefined, it will produce an error. What you want is: "x in setOfValues" to test for membership.
Simon Howard
It will evaluate to undefined. It will not result in an error. Try it out.
Tomalak
A: 

It depends on how many values there are.

If there are only a few values (fewer than 10 to 50), searching through the array may be OK. A hash table might be overkill.

If you have lots of values, a hash table is the best option. It requires less work than sorting the values and doing a binary search.
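
To make the comparison concrete, here is a minimal sketch of both approaches. The function names binaryContains, buildLookup and hashContains are made up for illustration, and which approach is actually faster still depends on the engine and the data.

```javascript
// Binary search over an array of strings that is already sorted.
function binaryContains(sorted, target) {
  var lo = 0;
  var hi = sorted.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;          // midpoint of the remaining range
    if (sorted[mid] === target) return true;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return false;
}

// Hash-table approach: copy the values into an object used as a lookup table.
function buildLookup(values) {
  var lookup = {};
  for (var i = 0; i < values.length; i++) {
    lookup[values[i]] = true;
  }
  return lookup;
}

function hashContains(lookup, target) {
  // Own-property check so inherited keys do not count as members.
  return Object.prototype.hasOwnProperty.call(lookup, target);
}

var values = ["pear", "apple", "fig", "cherry"];
// The default sort orders strings by code unit, the same order "<" uses above.
var sorted = values.slice().sort();
var lookup = buildLookup(values);
```

The binary search needs the extra sort step and a sorted copy that must be kept in order on every insert, while the lookup object only needs one assignment per value.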

Burkhard
I thought everything in JS was a hash table; an array is just 0=>firstentry, 1=>secondentry, isn't it?
Andrew Bullock
@Trull: I doubt it. Array is different from Object, and probably optimized for performance, although it will depend on implementation.
PhiLho
A: 

You might also want to test your solution on all the browsers you are developing against. Different browsers have different JS implementations with different optimizations.

AndreasKnudsen
+7  A: 

Use a hash table, and do this:

// Initialise the set

var mySet = {};

// Add to the set

mySet["some string value"] = true;

...

// Test if a value is in the set:

if (testValue in mySet) {
     alert(testValue + " is in the set");
} else {
     alert(testValue + " is not in the set");
}
Simon Howard
Using the "in" operator is probably the more elegant way of doing it. +1
Tomalak
+4  A: 

A comment on the above-mentioned hash solutions: {} actually creates an object (as also mentioned above), which can lead to some side effects. One of them is that your "hash" already comes pre-populated with the default object methods.

So "toString" in setOfValues will be true (at least in Firefox). You can prepend another character e.g. "." to your strings to work around this problem or use the Hash object provided by the "prototype" library.
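
Two other workarounds, sketched here as alternatives that the answer itself does not mention: test only the object's own properties with hasOwnProperty, or create the object without a prototype via Object.create(null), which assumes an ES5-capable engine. The helper name hasValue is hypothetical.

```javascript
var setOfValues = {};
setOfValues["example value"] = true;

// The "in" operator also sees properties inherited from Object.prototype.
var inherited = "toString" in setOfValues;

// Workaround 1: only accept properties stored on the object itself.
function hasValue(set, value) {
  return Object.prototype.hasOwnProperty.call(set, value);
}

// Workaround 2 (assumes ES5): an object with no prototype has nothing to inherit,
// so "in" only finds keys that were explicitly added.
var bareSet = Object.create(null);
bareSet["example value"] = true;
```

With either workaround, "toString" is no longer reported as a member unless it was explicitly added. Engines that support ES2015 also provide a built-in Set, which avoids the issue entirely.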

Ralf
Thanks for noting this. Yet another reason for me to hate JavaScript.
Simon Howard
A: 

One possible way, particularly efficient if the set is immutable but still usable with a changing set:

var haystack = "monday tuesday wednesday thursday friday saturday sunday";
var needle = "Friday";
if (haystack.indexOf(needle.toLowerCase()) >= 0) alert("Found!");

Of course, you might need to change the separator depending on the strings you have to put there...

A more robust variant can include bounds to ensure neither "day wed" nor "day" can match positively:

var haystack = "!monday!tuesday!wednesday!thursday!friday!saturday!sunday!";
var needle = "Friday";
if (haystack.indexOf('!' + needle.toLowerCase() + '!') >= 0) alert("Found!");

This might not be needed if the input is trusted (e.g. it comes out of a database).
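
If the haystack is built from an array rather than written by hand, the bounded variant could be wrapped like this. It is only a sketch: SEP, buildHaystack and haystackContains are hypothetical names, and it assumes the separator character never needs to appear inside a value.

```javascript
var SEP = "!";   // assumed separator; pick a character your values never contain

// Join the values into one delimited string, e.g. "!monday!tuesday!".
function buildHaystack(values) {
  for (var i = 0; i < values.length; i++) {
    if (values[i].indexOf(SEP) !== -1) {
      throw new Error("Value contains the separator: " + values[i]);
    }
  }
  // An empty set becomes a single separator so that "" is not reported as a member.
  return values.length ? SEP + values.join(SEP) + SEP : SEP;
}

// True only if the needle appears as a whole, delimited entry.
function haystackContains(haystack, needle) {
  if (needle.indexOf(SEP) !== -1) return false;   // could otherwise span two entries
  return haystack.indexOf(SEP + needle + SEP) !== -1;
}

var days = buildHaystack(["monday", "tuesday", "wednesday"]);
```

Rejecting needles that contain the separator keeps a string such as "monday!tuesday" from matching across two adjacent entries.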

I used that in a Greasemonkey script, with the advantage of using the haystack directly out of GM's storage.

PhiLho