tags:

views:

197

answers:

10

I have a list of an arbitrary number of lists, for instance:

[[1,2,3], [3,4,5], [5,6,7], [7,8,9]]

Now I would like a list containing all elements that are present in more than one list:

[3,5,7]

How would I do that?

Thanks!

A: 

flatten, sort, 1 for loop comparing numbers before and after

joetsuihk
This would fail if a number occurs twice in the same list.
Max Shawabkeh
If you flatten first, you would get a "false positive" on input such as: [[1, 1], [2, 2]]
truppo
+11  A: 

The same way as you'd do it by hand:

seen = set()
repeated = set()
for l in list_of_lists:
  for i in set(l):
    if i in seen:
      repeated.add(i)
    else:
      seen.add(i)

By the way, here's the one liner (without counting the import) that some people were seeking (should be less efficient than the other approach)

from itertools import *
reduce(set.union, (starmap(set.intersection, combinations(map(set, ll), 2))))
fortran
Great use of set!
Khelben
Just be careful, as `l` needs also to be 'setted', or you'll get false positives in [[1,1],[2,2]]
Khelben
+1 for readability
Till Backhaus
as @Khelben says you should change `for i in l:` in `for i in set(l):`
Andrea Ambu
With @Khelben's correction applied that solution worked well. Thanks!
mathias
@Khelben thanks for pointing out, I was already thinking in "set mode" instead of lists ^_^
fortran
A: 

You can use a set see http://docs.python.org/library/stdtypes.html#set

luc
+1  A: 

You can use a dictionary to get the count of each

from collections import defaultdict

init_list = [[1,2,3], [3,4,5], [5,6,7], [7,8,9]]
#defaultdict, every new key will have a int(0) as default value
d = defaultdict(int)
for values in init_list:
  #Transform each list in a set to avoid false positives like [[1,1],[2,2]]
  for v in set(values):
    d[v] += 1

#Get only the ones that are more than once
final_list = [ value for value,number in d.items() if number > 1 ]
Khelben
Looks like a good job for `collections.defaultdict`.
muhuk
@muhuk. Yes! you're absolutely right! I've changed the code. The standard library of Python is really big and full of useful things...
Khelben
A: 
l=[[1,2,3], [3,4,5], [5,6,7], [7,8,9]]
d={}
for x in l:
    for y in x:
        if not d.has_key(y):
            d[y]=0
        d[y]+=1
[x for x,y in d.iteritems() if y>1]
Till Backhaus
+2  A: 

reference: http://docs.python.org/library/stdtypes.html#set

#!/usr/bin/python

ll = [[1,2,3], [3,4,5], [5,6,7], [7,8,9]]
ls = [set(l) for l in ll]

su = ls[0]  #union
ssd = ls[0] #symmetric_difference
for s in ls[1:]:
  su = su.union(s)
  ssd = ssd.symmetric_difference(s)

result = su.difference(ssd)
print list(result)

=>

[3, 5, 7]

revise and adopt FP,

ll = [[1,2,3], [3,4,5], [5,6,7], [7,8,9]]

u = reduce(set.union, map(set, ll))
sd = reduce(set.symmetric_difference, map(set, ll))
print u - sd

=>

[3, 5, 7]
Dyno Fu
A: 

Try this:

data = [[1,2,3], [3,4,5], [5,6,7], [7,8,9], [1,2,3]]

res = set()

for i in data:
    for j in data:
        if i is not j:
            res |= set(i) & set(j)

print res
liwp
+4  A: 

Cleanest way would probably be to use reduce:

def findCommon(L):
    def R(a, b, seen=set()):
        a.update(b & seen)
        seen.update(b)
        return a
    return reduce(R, map(set, L), set())

result = findCommon([[1,2,3], [3,4,5], [5,6,7], [7,8,9]])

Result is a set, but just do list(result) if you really need a list.

truppo
+1 for functional programming.
Dyno Fu
@Dyno the `seen` variable contains a "hidden" mutable state between calls... functional programming shouldn't rely on side effects (and it's a peril if they are not explicit, I bet that most of people wouldn't notice that seen is playing the role of a "static" variable instead of a regular argument)
fortran
@fortran: The typical usage of functions such as R is one-time, since they are placed inside another function and not at module level. I'll update the answer to make that clearer.
truppo
@truppo I know that, but even with that premises an incautious reader could be fooled... I think it would be cleaner to move `seen` outside the parameter list of `R` and make it a local variable of `findCommon`.
fortran
+1  A: 
>>> sets = [[1,2,3], [3,4,5], [5,6,7], [7,8,9]]
>>> seen = set()
>>> duplicates = set()
>>> 
>>> for subset in map(set, sets) :
...     duplicates |= (subset & seen)
...     seen |= subset
... 
>>> print(duplicates)
set([3, 5, 7])
>>> 

I tried for a one-line answer with map/reduce, but can't quite get it yet.

sykora
A: 

Here is my go:

seen = set()
result = set()
for s in map(set, [[1,2,3], [3,4,5], [5,6,7], [7,8,9]]):
    result.update(s & seen)
    seen.update(s)
print result

This prints:

set([3, 5, 7])
Daren Thomas