I'm tempted to define my Python classes like this:

class MyClass(object):
    """my docstring"""

    msg = None
    a_variable = None
    some_dict = {}

    def __init__(self, msg):
        self.msg = msg

Is declaring the object variables (msg, a_variable, etc.) at the top, as in Java, good, bad, or indifferent? I know it's unnecessary, but it's still tempting to do.

+6  A: 

Defining variables in the class definition like that makes each variable shared by every instance of that class. In Java terms, it is a bit like making the variable static. However, there are major differences, as shown below.

class MyClass(object):
    msg = "ABC"

print(MyClass.msg)      # prints ABC
a = MyClass()
print(a.msg)            # prints ABC (read falls back to the class attribute)
a.msg = "abc"           # creates a separate attribute on the instance
print(a.msg)            # prints abc
print(MyClass.msg)      # prints ABC
print(a.__class__.msg)  # prints ABC

As the code above shows, it is not quite the same thing: the variable can be read via self.msg, but assigning to self.msg creates a separate attribute on the instance and leaves the variable defined at class scope unchanged.

One disadvantage of doing it your way is that it can lead to errors, because it adds hidden state to the class. Say someone left self.msg = "ABC" out of the constructor (or, more realistically, the code was refactored and only one of the definitions was altered):

a = MyClass()
print(a.msg)   # prints ABC

# somewhere else in the program
MyClass.msg = "XYZ"

# now the same bit of code leads to a different result, despite the
# expectation that it leads to the same result.
a = MyClass()
print(a.msg)   # prints XYZ

It is far better to avoid defining msg at the class level; then you avoid these issues:

class MyClass(object):
    pass

print(MyClass.msg)  # AttributeError: type object 'MyClass' has no attribute 'msg'
Yacoby
+3  A: 

Careful. The two msg attributes are actually stored in two different dictionaries. One shadows the other, but the shadowed msg attribute still takes up space in the class's dictionary. So it goes unused and yet still takes up some memory.

class MyClass(object):
    msg = 'FeeFiFoFum'
    def __init__(self, msg):
        self.msg = msg

m = MyClass('Hi Lucy')

Notice that we have 'Hi Lucy' as the value.

print(m.__dict__)
# {'msg': 'Hi Lucy'}

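Notice that MyClass's dict (accessed through m.__class__) still has 'FeeFiFoFum'. The output below appears to come from the full class in the question, which is why it also lists a_variable, some_dict and the docstring.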

print(m.__class__.__dict__)
# {'__dict__': <attribute '__dict__' of 'MyClass' objects>, '__module__': '__main__', '__init__': <function __init__ at 0xb76ea1ec>, 'msg': 'FeeFiFoFum', 'some_dict': {}, '__weakref__': <attribute '__weakref__' of 'MyClass' objects>, '__doc__': 'my docstring', 'a_variable': None}

Another (perhaps simpler) way to see this:

print(m.msg)
# Hi Lucy
print(MyClass.msg)
# FeeFiFoFum
unutbu
I don't think "takes up some memory" is a notable issue in this case. There won't be more "FeeFiFoFum"'s if you create more MyClass instances. There will only ever be one of them. Not a big deal memory wise.
FogleBird
@FogleBird: I agree that the amount of memory wasted would be minimal, but why waste memory when you don't have to? Anyway, I mentioned the memory issue mainly to demonstrate that the class attribute and the instance attribute were entirely different things. Initializing the attribute at the class level does not initialize the instance attribute, so the OP's code wasn't doing what I think the OP thought it was doing.
unutbu
+5  A: 

Declaring variables directly inside the class definition makes them class variables instead of instance variables. Class variables are somewhat similar to static variables in Java and should be used like MyClass.a_variable. But they can also be read like self.a_variable, which is a problem because naive programmers can mistake them for instance variables. Your "some_dict" variable, for example, is shared by every instance of MyClass, so if you add a key "k" to it through one instance, that key is visible from all of them.
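
A minimal sketch of that sharing, reusing the class from the question. The instance names a and b and the value stored under "k" are only illustrative:

```python
class MyClass(object):
    some_dict = {}  # a single dict object, stored on the class

    def __init__(self, msg):
        self.msg = msg  # stored on the instance


a = MyClass("first")
b = MyClass("second")

# Mutating the dict through one instance changes the one shared object.
a.some_dict["k"] = 1

print(b.some_dict)                 # {'k': 1}
print(MyClass.some_dict)           # {'k': 1}
print(a.some_dict is b.some_dict)  # True
```

Note that nothing was assigned to a.some_dict here; the dict was only mutated in place, so no instance attribute was created.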

If you always remember to re-assign class variables in the constructor, they behave almost the same as instance variables; only the initial definition remains on MyClass. Still, that's not good practice, because you will run into trouble whenever you forget to re-assign one of them!

Better to write the class like so:

class MyClass(object):
    """
    Some class
    """

    def __init__(self, msg):
        self.__msg = msg
        self.__a_variable = None
        self.__some_dict = {}

Using two underscores for "private" variables (pseudo-private!) is optional. If the variables should be public, just keep their names without the __ prefix.
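
To make the "pseudo-private" part concrete, here is a small sketch of what the two underscores actually do: Python stores the attribute under a mangled name that includes the class name. The get_msg method is a hypothetical accessor added only for this illustration.

```python
class MyClass(object):
    def __init__(self, msg):
        self.__msg = msg  # actually stored as _MyClass__msg

    def get_msg(self):
        # Inside the class body, self.__msg is mangled the same way.
        return self.__msg


m = MyClass("hello")

print(m.get_msg())          # hello
print(sorted(vars(m)))      # ['_MyClass__msg']
print(m._MyClass__msg)      # hello (still reachable from outside)
print(hasattr(m, "__msg"))  # False
```

So the prefix only renames the attribute; it does not prevent access.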

AndiDog
The `__` prefix does not make a meaningfully private variable, just does some name-mangling (as explained in the linked documentation), which ends up making the class more annoying to subclass and test most of the time. Not much good Python code seems to use it these days, preferring the conventional single leading underscore to indicate internal attributes.
Mike Graham
@Mike Graham: I mentioned it was optional. Developers can decide that themselves.
AndiDog
A: 

When you declare a class, Python will parse its code and put everything in the namespace of the class; then the class will be used as a kind of template for all objects derived from it - but any object will have its own copy of the reference.
Note that you always have a reference; as such, if you alter the referenced object in place, the change is visible everywhere that object is used. However, the slot for the member data is unique to each instance, so assigning a new object to it does not affect any other place where it is used.

Note: Michael Foord has a very nice blog entry on how class instantiation works; if you are interested in this topic, I recommend that short read.

Anyway, for all practical uses, there are two main differences between your two approaches:

  1. The name is already available at class level, and you can use it without instantiating a new object; this may sound neat for declaring constants in namespaces, but in many cases the module name may already be a good one.
  2. The name is added at class level - it means that you may not be able to mock it easily during unit tests, and that if you have any expensive operation, you get it at the very moment of the import.
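
The second point can be seen in a small sketch: the class body runs once, when the class statement itself is executed (for a module-level class, that is at import time). The expensive_default function, the calls list, and both class names are hypothetical stand-ins for any costly computation.

```python
calls = []  # records each time the "expensive" function runs


def expensive_default():
    calls.append("called")
    return {"loaded": True}


class AtClassLevel(object):
    # Evaluated immediately, while the class body is being executed.
    settings = expensive_default()


class InInit(object):
    def __init__(self):
        # Evaluated only when an instance is created.
        self.settings = expensive_default()


print(len(calls))  # 1: the class-level call already ran, with no instance created
obj = InInit()
print(len(calls))  # 2
```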

When reviewing code, I usually regard members declared at class level with a bit of suspicion; there are plenty of good use cases for them, but it is also quite likely that they are there out of habit from previous experience with other programming languages.

Roberto Liffredo
"the class will be used as a kind of template for all objects derived from it - but any object will have its own copy of the data" is not correct.
Craig McQueen
-1: setting a class member like the OP's `msg`, `a_variable` and `some_dict` definitely makes them shared between all objects of the class unless the object's constructor copies and replaces the members. What example have you seen where this is different?
Jarret Hardie
You are both right. I was focusing only on the fact that the member slot is not shared, but the fact that it holds a reference should also be part of the explanation. Hopefully it is clearer now.
Roberto Liffredo
So, what is the answer to the question? Is it good, bad or indifferent?
OscarRyz
It depends on what you want to do. If you do it just because you have the habit from other languages, it may be bad. If you do it because you want to have it in that way, it is very good.
Roberto Liffredo