ansaurus

Question

What is the correct (or best) way to subclass the Python set class, adding a new instance variable?

Answer 1

A:

set1 | set2 is an operation that won't modify either existing set, but return a new set instead. The new set is created and returned. There is no way to make it automatically copy arbritary attributes from one or both of the sets to the newly created set, without customizing the | operator yourself by defining the __or__ method.

class MySet(set):
    def __init__(self, *args, **kwds):
        super(MySet, self).__init__(*args, **kwds)
        self.foo = 'nothing'
    def __or__(self, other):
        result = super(MySet, self).__or__(other)
        result.foo = self.foo + "|" + other.foo
        return result

r = MySet('abc')
r.foo = 'bar'
s = MySet('cde')
s.foo = 'baz'

t = r | s

print r, s, t
print r.foo, s.foo, t.foo

Prints:

MySet(['a', 'c', 'b']) MySet(['c', 'e', 'd']) MySet(['a', 'c', 'b', 'e', 'd'])
bar baz bar|baz

nosklo 2009-04-28 15:29:24

This is what i suspected. In this case, I'll have to override __and__, __or__, __rand__, __ror__, __rsub__, __rxor__, __sub__, __xor__, add, copy, difference, intersection, symmetric_difference, and union. Have I missed any? To be honest, I was looking for something with the simple generality of the 2.5 solution I listed above... but a negative answer is good too. It does seem a little like a bug to me.

rog 2009-04-28 15:45:21

Answer 2

A:

For me this works perfectly using Python 2.5.2 on Win32. Using you class definition and the following test:

f = Fooset([1,2,4])
s = sets.Set((5,6,7))
print f, f.foo
f.foo = 'bar'
print f, f.foo
g = f | s
print g, g.foo
assert( (f | f).foo == 'bar')

I get this output, which is what I expect:

Fooset([1, 2, 4]) default
Fooset([1, 2, 4]) bar
Fooset([1, 2, 4, 5, 6, 7]) bar

Ber 2009-04-28 15:39:34

yes, this works with 2.5.2, but can you make it work with the built-in set type in python 2.6?

rog 2009-04-28 15:41:37

since you had `import sets` in your code, and did not mention 2.6, I assumed you'd be using the sets.py module. If this is no longer available in 2.6 your likely out of luck

Ber 2009-04-28 18:52:04

Answer 3

+1 A:

It looks like set bypasses __init__ in the c code. However you will end an instance of Fooset, it just won't have had a chance to copy the field.

Apart from overriding the methods that return new sets I'm not sure you can do too much in this case. Set is clearly built for a certain amount of speed, so does a lot of work in c.

John Montgomery 2009-04-28 15:59:59

*sigh*. thanks. that was my reading of the C code too, but i'm a python newbie so thought it was worth asking. i'd forgotten my reasons for disliking subclassing in general - the "external" subclass becomes dependent on the unpublished internal implementation details of its superclass.

rog 2009-04-28 16:14:31

Answer 4

A:

Assuming the other answers are correct, and overriding all the methods is the only way to do this, here's my attempt at a moderately elegant way of doing this. If more instance variables are added, only one piece of code needs to change. Unfortunately if a new binary operator is added to the set object, this code will break, but I don't think there's a way to avoid that. Comments welcome!

def foocopy(f):
 def cf(self, new):
  r = f(self, new)
  r.foo = self.foo
  return r
 return cf

class Fooset(set):
 def __init__(self, s = []):
  set.__init__(self, s)
  if isinstance(s, Fooset):
   self.foo = s.foo
  else:
   self.foo = 'default'

 def copy(self):
  x = set.copy(self)
  x.foo = self.foo
  return x

 @foocopy
 def __and__(self, x):
  return set.__and__(self, x)

 @foocopy
 def __or__(self, x):
  return set.__or__(self, x)

 @foocopy
 def __rand__(self, x):
  return set.__rand__(self, x)

 @foocopy
 def __ror__(self, x):
  return set.__ror__(self, x)

 @foocopy
 def __rsub__(self, x):
  return set.__rsub__(self, x)

 @foocopy
 def __rxor__(self, x):
  return set.__rxor__(self, x)

 @foocopy
 def __sub__(self, x):
  return set.__sub__(self, x)

 @foocopy
 def __xor__(self, x):
  return set.__xor__(self, x)

 @foocopy
 def difference(self, x):
  return set.difference(self, x)

 @foocopy
 def intersection(self, x):
  return set.intersection(self, x)

 @foocopy
 def symmetric_difference(self, x):
  return set.symmetric_difference(self, x)

 @foocopy
 def union(self, x):
  return set.union(self, x)


f = Fooset([1,2,4])
f.foo = 'bar'
assert( (f | f).foo == 'bar')

rog 2009-04-28 16:31:30

You've got some infinite recursion in the copy method. x = self.copy() should be x = super(Fooset,self).copy()

John Montgomery 2009-04-28 17:50:02

yes, you're right. is using super() better than explicitly mentioning the superclass?

rog 2009-04-28 18:20:01

Answer 5

+2 A:

My favorite way to wrap methods of a built-in collection:

class Fooset(set):
    def __init__(self, s=(), foo=None):
        super(Fooset,self).__init__(s)
        if foo is None and hasattr(s, 'foo'):
            foo = s.foo
        self.foo = foo



    @classmethod
    def _wrap_methods(cls, names):
        def wrap_method_closure(name):
            def inner(self, *args):
                result = getattr(super(cls, self), name)(*args)
                if isinstance(result, set) and not hasattr(result, 'foo'):
                    result = cls(result, foo=self.foo)
                return result
            inner.fn_name = name
            setattr(cls, name, inner)
        for name in names:
            wrap_method_closure(name)

Fooset._wrap_methods(['__ror__', 'difference_update', '__isub__', 
    'symmetric_difference', '__rsub__', '__and__', '__rand__', 'intersection',
    'difference', '__iand__', 'union', '__ixor__', 
    'symmetric_difference_update', '__or__', 'copy', '__rxor__',
    'intersection_update', '__xor__', '__ior__', '__sub__',
])

Essentially the same thing you're doing in your own answer, but with fewer loc. It's also easy to put in a metaclass if you want to do the same thing with lists and dicts as well.

Matthew Marshall 2009-04-30 01:01:00

that's a useful contribution, thanks. it doesn't look like you're gaining much by making _wrap_methods a class method rather than a function - is that purely for the modularity it gives?

rog 2009-05-01 11:39:36

ansaurus

tags:

views:

answers:

What is the correct (or best) way to subclass the Python set class, adding a new instance variable?

related questions