views:

6352

answers:

9

I need to check a javascript array to see if there are any duplicate values. What's the easiest way to do this? I just need to find what the duplicated values are - I don't actually need their indexes or how many times they are duplicated.

I know I can loop through the array and check all the other values for a match, but it seems like there should be an easier way. Any ideas? Thanks!

+2  A: 

You could sort the array and then run through it and then see if the next (or previous) index is the same as the current. Assuming your sort algorithm is good, this should be less than O(n^2):

var arr = [9, 9, 111, 2, 3, 4, 4, 5, 7];
var sorted_arr = arr.sort(); // You can define the comparing function here. JS default uses a crappy string compare.
var results = [];
for (var i = 0; i < arr.length - 1; i += 1) {
 if (sorted_arr[i + 1] == sorted_arr[i]) {
  results.push(sorted_arr[i]);
 }
}

alert(results);
swilliams
Emil, your edit was correct, thanks. :) I just say "Oh squared".
swilliams
I think everybody understand what you meant, anyway, but nevertheless. :) Good solution btw. +1
Emil H
"Assuming your sort algorithm is good, this should be less than O^2". Specifically, it could be O(n*log(n)).
ESRogs
And this is the best you can do in the comparison model.
SplittingField
@swilliams: What if the array is an array of html elements, how do you sort them 1st?!
Marco Demajo
This script doesn't work so well with more than 2 duplicates (e.g. `arr = [9, 9, 9, 111, 2, 3, 3, 3, 4, 4, 5, 7];`
fudgey
A: 

The Prototype library has a uniq function, which returns the array without the dupes. That's only half of the work though.

Diodeus
+2  A: 

You can add this function, or tweak it and add it to Javascript's Array prototype:

Array.prototype.unique = function () {
    var r = new Array();
    o:for(var i = 0, n = this.length; i < n; i++)
    {
     for(var x = 0, y = r.length; x < y; x++)
     {
      if(r[x]==this[i])
      {
                alert('this is a DUPE!');
       continue o;
      }
     }
     r[r.length] = this[i];
    }
    return r;
}

var arr = [1,2,2,3,3,4,5,6,2,3,7,8,5,9];
var unique = arr.unique();
alert(unique);
karim79
+1: great code, didn't know about continue operator could place a label after it.
Marco Demajo
+8  A: 

If you want to elimate the duplicates, try this great solution:

function eliminateDuplicates(arr) {
  var i,
      len=arr.length,
      out=[],
      obj={};

  for (i=0;i<len;i++) {
    obj[arr[i]]=0;
  }
  for (i in obj) {
    out.push(i);
  }
  return out;
}

Its one of the greatest code snippets for JavaScript i've seen. The original is published here: http://dreaminginjavascript.wordpress.com/2008/08/22/eliminating-duplicates/

Raphael Montanaro
That is good code, but unfortunately it doesn't do what I'm asking for.
Scott Saunders
The code above (which is mine--that's my blog) gets you pretty close. A small tweak and you're there.First of all, you can see if arr.length and out.length are the same. If they are the same, there are no duplicated elements.But you want a little more. If you want to "catch" the dupes as they happen, check to see if the length of the array increases after the obj[arr[i]]=0 line. Nifty, eh? :-)Thanks for the nice words, Raphael Montanaro.
Nosredna
@Raphael Montanaro: Pretty clever trick, sadly works only for an Array of strings/numbers, you can't remove dupicates with this function if it's an Array of objects. Moreover even on a simple Array of strings if one string simply contains a space like "hello babe" your code would crash miserably.
Marco Demajo
Also turns Numbers into Strings, which may or may not be acceptable depending on what you're doing. Nice trick however as long as one understands the gotchas
Infinity
+1  A: 

This should get you what you want, Just the duplicates.

function find_duplicates(arr) {
  var len=arr.length,
      out=[],
      counts={};

  for (var i=0;i<len;i++) {
    var item = arr[i];
    var count = counts[item];
    counts[item] = counts[item] >= 1 ? counts[item] + 1 : 1;
  }

  for (var item in counts) {
    if(counts[item] > 1)
      out.push(item);
  }

  return out;
}

find_duplicates(['one',2,3,4,4,4,5,6,7,7,7,'pig','one']); // -> ['one',4,7] in no particular order.
Daniel Beardsley
I verified this BTW and it works.
Daniel Beardsley
var count is not used..
vsync
A: 

Just to add some theory to the above.

Finding duplicates has a lower bound of O(n*log(n) in the comparison model. SO theoretically, you cannot do any better than first sorting then going through the list sequentially removing any duplicates you find.

If you want to find the duplicates in linear (O(n)) expected time, you could hash each element of the list; if there is a collision, remove/label it as a duplicate, and continue.

SplittingField
Agreed. The only reason to try different approaches here is that the speed depends on how well various things are implemented in the runtime. And that's going to vary browser-to-browser. For short lists, it probably doesn't matter much how you solve the problem. For large arrays, it does.
Nosredna
+1  A: 

The following function (a variation of the eliminateDuplicates function already mentioned) seems to do the trick, returning test2,1,7,5 for the input ["test", "test2", "test2", 1, 1, 1, 2, 3, 4, 5, 6, 7, 7, 10, 22, 43, 1, 5, 8]

Note that the problem is stranger in JavaScript than in most other languages, because a JavaScript array can hold just about anything. Note that solutions that use sorting might need to provide an appropriate sorting function--I haven't tried that route yet.

This particular implementation works for (at least) strings and numbers.

function findDuplicates(arr) {
 var i,
     len=arr.length,
     out=[],
     obj={};

 for (i=0;i<len;i++) {
  if (obj[arr[i]] != null) {
   if (!obj[arr[i]]) {
    out.push(arr[i]);
    obj[arr[i]] = 1;
   }
  } else {
   obj[arr[i]] = 0;   
  }
 }
 return out;
}
Nosredna
A: 

/* The indexOf method of the Array object is useful for comparing array items. IE is the only major browser that does not natively support it, but it is easy to implement: */

Array.prototype.indexOf= Array.prototype.indexOf || function(what, i){
    i= i || 0;
    var L= this.length;
    while(i<L){
     if(this[i]=== what) return i;
     ++i;
    }
    return -1;
}

function getarrayduplicates(arg){
    var itm, A= arg.slice(0, arg.length), dups= [];
    while(A.length){
     itm= A.shift();
     if(A.indexOf(itm)!= -1 && dups.indexOf(itm)== -1){
      dups[dups.length]= itm;
     }
    }
    return dups;
}

var a1= [1, 22, 3, 2, 2, 3, 3, 4, 1, 22, 7, 8, 9];

alert(getarrayduplicates(a1));

For very large arrays, it can be faster to remove the duplicates from the array as they are found, so that they will not be looked at again:

function getarrayduplicates(arg){
    var itm, A= arg.slice(0, arg.length), dups= [];
    while(A.length){
     itm= A.shift();
     if(A.indexOf(itm)!= -1){
      dups[dups.length]= itm;
      while(A.indexOf(itm)!= -1){
       A.splice(A.indexOf(itm), 1);
      }
     }
    }
    return dups;
}
kennebec
A: 

This is what I use, and I include the indexOf for browsers who doesn't support it Nativity.
Its always good to have it in your code :)
In this example, I merge 2 arrays and eliminate duplicates in the result.

if (!Array.prototype.indexOf){
   Array.prototype.indexOf = function(elt /*, from*/){
     var len = this.length >>> 0;

     var from = Number(arguments[1]) || 0;
     from = (from < 0) ? Math.ceil(from) : Math.floor(from);
     if (from < 0)
        from += len;

     for (; from < len; from++){
        if (from in this && this[from] === elt)
           return from;
     }
     return -1;
  };
}

Array.prototype.unique = function () {
    var r = [];
    for(var i = 0; i < this.length; i++){
       if( r.indexOf(this[i]) === -1)
          r.push(this[i]);
    }
    return r;
}

var arr = [1,2,2,3,3,4,5,6,2,3,7,8,5,9];
var arr2 = [1,2,511,12,50];
var unique = arr.concat(arr2).unique();
alert(unique);

if you use jQuery, you could use its 'inArray' function like this:

if( $.inArray(this[i], r) == -1 )

instead of adding the 'Array.prototype.indexOf'

vsync