views:

320

answers:

7

I have a list in Python that I generate as part of the program. I have a strong assumption that these are all different, and I check this with an assertion.

This is the way I do it now:

If there are two elements:

try:
    assert(x[0] != x[1])
except:
    print debug_info
    raise Exception("throw to caller")

If there are three:

try:
    assert(x[0] != x[1])
    assert(x[0] != x[2])
    assert(x[1] != x[2])
except:
    print debug_info
    raise Exception("throw to caller")

And if I ever have to do this with four elements I'll go crazy.

Is there a better way to ensure that all the elements of the list are unique?

+22  A: 

Maybe something like this:

if len(x) == len(set(x)):
    print "all elements are unique"
else
    print "elements are not unique"
mikez302
Very clever answer
foosion
You could just store them in a set in the first place to ensure that they're all unique. Or store them in a set, but before adding to the set check for membership. But this definitely works if you don't have control over the input format.
Chris Lutz
sets don't necessarily retain the order, which might be important.
Colin Coghill
Why is that important for this purpose. All I am trying to do is see if the list has the same number of elements after removing duplicates.
mikez302
+7  A: 

How about this:

if len(x) != len(set(x)):
    raise Exception("throw to caller")

This assumes that elements in x are hashable.

scrible
+2  A: 

Hopefully all the items in your sequence are immutable -- if not, you will not be able to call set on the sequence.

>>> set( ([1,2], [3,4]) )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

If you do have mutable items, you can't hash the items and you will pretty much have to repeatedly check through the list:

def isUnique(lst):
    for i,v in enumerate(lst):
        if v in lst[i+1:]:
            return False
    return True

>>> isUnique( ([1,2], [3,4]) )
True
>>> isUnique( ([1,2], [3,4], [1,2]) )
False
Mark Rushakoff
+1  A: 

As you build the list you can check to see if the value already exists, e.g:

if x in y:
     raise Exception("Value %s already in y" % x)
else:
     y.append(x)

the benefit of this is that the clashing variable will be reported.

A: 

You could process the list to create a known-to-be-unique copy:

def make_unique(seq): 
    t = type(seq) 
    seen = set()
    return t(c for c in seq if not (c in seen or seen.add(c)))

Or if the seq elements are not hashable:

def unique1(seq):
    t = type(seq) 
    seen = [] 
    return t(c for c in seq if not (c in seen or seen.append(c)))

And this will keep the items in order (omitting duplicates, of course).

Paul McGuire
+11  A: 

The most popular answers are O(N) (good!-) but, as @Paul and @Mark point out, they require the list's items to be hashable. Both @Paul and @Mark's proposed approaches for unhashable items are general but take O(N squared) -- i.e., a lot.

If your list's items are not hashable but are comparable, you can do better... here's an approach that always work as fast as feasible given the nature of the list's items.

import itertools

def allunique(L):
  # first try sets -- fastest, if all items are hashable
  try:
    return len(L) == len(set(L))
  except TypeError:
    pass
  # next, try sort -- second fastest, if items are comparable
  try:
    L1 = sorted(L)
  except TypeError:
    pass
  else:
    return all(len(list(g))==1 for k, g in itertools.groupby(L1))
  # fall back to the slowest but most general approach
  return all(v not in L[i+1:] for i, L in enumerate(L))

This is O(N) where feasible (all items hashable), O(N log N) as the most frequent fallback (some items unhashable, but all comparable), O(N squared) where inevitable (some items unhashable, e.g. dicts, and some non-comparable, e.g. complex numbers).

Inspiration for this code comes from an old recipe by the great Tim Peters, which differed by actually producing a list of unique items (and also was so far ago that set was not around -- it had to use a dict...!-), but basically faced identical issues.

Alex Martelli
I miss Tim. Must invite him to lunch again soon.
Heh, I miss him more (it's been longer!), but I guess it's kind of impractical for either of us to fly coast to coast just for a lunch;-).
Alex Martelli
A: 

I would use this:

mylist = [1,2,3,4]
is_unique = all(mylist.count(x) == 1 for x in mylist)
Jeremy
`O(n**2)`, isn't?
SilentGhost