In Python, what is the "one [...] obvious way" to add all items of an iterable to an extant set?

A: 

I love this way:

set =  [item for item in items]
dolma33
@dolma, I don't want to *create* a new set, but add to an extant set. Sorry if that was unclear; I've edited the question to clarify.
Ian Mackinnon
@dolma, also, (a) you can do `s = set(items)` and (b) your example redefines a built-in :)
Ian Mackinnon
then you simply can do `set.extend([item for item in items])`, I guess...
dolma33
That's a list, not a set.
FogleBird
oops that's true... so it would simply be `myset.update(items)` ...
dolma33
I was totally thinking about lists and not about sets...
dolma33
+1  A: 

Use a list comprehension.

For example, materializing an iterator into a list and then updating the set with it :)

>>> x = [1, 2, 3, 4]
>>> 
>>> k = x.__iter__()
>>> k
<listiterator object at 0x100517490>
>>> l = [y for y in k]
>>> l
[1, 2, 3, 4]
>>> 
>>> z = set([1, 2])
>>> z.update(l)
>>> z
set([1, 2, 3, 4])
>>> 

[Edit: missed the set part of question]

pyfunc
I don't see any sets? Am I missing something?
Ian Mackinnon
@Ian Mackinnon: Aah! Missed that point completely. Thanks.
pyfunc
+13  A: 

By set, do you mean `set`?

>>> foo = set(range(0, 4))
>>> foo
set([0, 1, 2, 3])
>>> foo.update(range(2, 6))
>>> foo
set([0, 1, 2, 3, 4, 5])
TokenMacGuy
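A minimal sketch of the same idea under Python 3, where a set prints as `{...}` rather than `set([...])`. The names `extant` and `more` are illustrative only and are not from the answer above. `set.update` takes any iterable, and can take several at once, adding each element rather than the iterable itself:

```python
# Illustrative names; not from the original answer.
extant = {0, 1, 2, 3}
more = [2, 3, 4, 5]          # any iterable works: list, tuple, generator...

extant.update(more)          # adds each element in place; returns None
print(sorted(extant))        # [0, 1, 2, 3, 4, 5]

# update() also accepts several iterables in one call.
extant.update((6, 7), (n * 10 for n in range(2)))
print(sorted(extant))        # [0, 1, 2, 3, 4, 5, 6, 7, 10]
```

Duplicates are dropped silently, so overlapping input is harmless.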
Yes, I didn't expect that to be ambiguous, but it seems it is :) Update is the very thing. Thank you!
Ian Mackinnon
Probably would be a more lucid example if you didn't update with items that are a superset of what was there before
Nick T
Just looked back at my interpreter session and I actually tried this, but thought that it had added the whole list as an element of the set because of the square brackets in the representation of the set. I had never noticed before that they're represented like that.
Ian Mackinnon
+1  A: 
for item in items:
    extant_set.add(item)

For the record, I think the assertion that "There should be one-- and preferably only one --obvious way to do it." is bogus. It makes an assumption that many technically minded people make: that everyone thinks alike. What is obvious to one person is not so obvious to another.

I would argue that my proposed solution is clearly readable, and does what you ask. I don't believe there are any performance hits involved with it--though I admit I might be missing something. But despite all of that, it might not be obvious and preferable to another developer.

jaydel
Argh! The for loop being on one line like that is a formatting problem in my answer--I would never do that. Ever.
jaydel
sets use `add` not `append`
FogleBird
You are absolutely correct. I edited the post to repair my damage. Thanks :)
jaydel
@jaydel: You are missing the point that `aset.update(iterable)` loops at C speed whereas `for item in iterable: aset.add(item)` loops at Python speed, with a method lookup and a method call (aarrgghh!!) per item.
John Machin
@jaydel: See my answer.
John Machin
A: 

Sets have the add method too:

>>> s = set([1, 2, 3, 4])
>>> for n in range(10):
...     s.add(n)
...
>>> s
set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
bigjim
+2  A: 

You can use `set()` to convert an iterable into a set, and then use the standard set update operator (`|=`) to add the unique values from your new set into the existing one.

>>> a = { 1, 2, 3 }
>>> b = ( 3, 4, 5 )
>>> a |= set(b)
>>> a
set([1, 2, 3, 4, 5])
gbc
Using `.update` has the benefit that the argument can be any iterable —not necessarily a set— unlike the RHS of the `|=` operator in your example.
ΤΖΩΤΖΙΟΥ
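A short sketch of that difference, assuming Python 3 and reusing the names `a` and `b` from the answer (`c` is an extra illustrative name): the operator form rejects a non-set right-hand side, while the method form accepts it directly.

```python
a = {1, 2, 3}
b = (3, 4, 5)                # a tuple, not a set

try:
    a |= b                   # operator form rejects non-set operands
except TypeError as exc:
    print("TypeError:", exc)

a |= set(b)                  # works once the tuple is converted

c = {1, 2, 3}
c.update(b)                  # method form accepts any iterable
print(a == c)                # True
```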
Good point. It's just an aesthetic choice since set() can convert an iterable into a set, but the number of keystrokes is the same.
gbc
+1  A: 

For the benefit of anyone who might believe e.g. that doing aset.add() in a loop would have performance competitive with doing aset.update(), here's an example of how you can test your beliefs quickly before going public:

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "a.update(it)"
1000 loops, best of 3: 1 msec per loop

>\python27\python -mtimeit -s"it=xrange(10000);a=set(xrange(100))" "for i in it:a.add(i)"
100 loops, best of 3: 3.16 msec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "a.update(it)"
100 loops, best of 3: 1.97 msec per loop

>\python27\python -mtimeit -s"it=xrange(20000);a=set(xrange(100))" "for i in it:a.add(i)"
100 loops, best of 3: 6.39 msec per loop

Looks like the cost per item of the loop approach is over THREE times that of the update approach.

John Machin
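For Python 3, where `xrange` no longer exists, a rough equivalent of the measurements above can be sketched with the `timeit` module. Here `range` stands in for `xrange`, and the repeat count of 200 is an arbitrary choice to keep the run short. No timings are claimed: the numbers depend on the machine and interpreter version.

```python
import timeit

# Same setup as the command-line runs, with range in place of xrange.
setup = "it = range(10000); a = set(range(100))"
number = 200  # arbitrary repeat count, chosen so the sketch finishes quickly

t_update = timeit.timeit("a.update(it)", setup=setup, number=number)
t_loop = timeit.timeit("for i in it: a.add(i)", setup=setup, number=number)

print("update:   %.3f ms per loop" % (t_update / number * 1000))
print("add loop: %.3f ms per loop" % (t_loop / number * 1000))
```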