views:

181

answers:

7

I'm fairly new to actual programming languages, and Python is my first one. I know my way around Linux a bit, enough to get a summer job with it (I'm still in high school), and on the job, I have a lot of free time which I'm using to learn Python.

One thing's been getting me though. What exactly is different in Python when you have expressions such as

x.__add__(y) <==> x+y
x.__getattribute__('foo') <==> x.foo

I know what methods do and stuff, and I get what they do, but my question is: How are those double underscore methods above different from their simpler looking equivalents?

P.S., I don't mind being lectured on programming history, in fact, I find it very useful to know :) If these are mainly historical aspects of Python, feel free to start rambling.

A: 

Look it is well explained here. http://stackoverflow.com/questions/1301346/the-meaning-of-a-single-and-a-double-underscore-before-an-object-name-in-python

juanefren
-1: He doesn't ask about name mangling or "private" members, he asks about special method names.
delnan
+1  A: 

When you start a method with two underscores (and no trailing underscores), Python's name mangling rules are applied. This is a way to loosely simulate the private keyword from other OO languages such as C++ and Java. (Even so, the method is still technically not private in the way that Java and C++ methods are private, but it is "harder to get at" from outside the instance.)

Methods with two leading and two trailing underscores are considered to be "built-in" methods, that is, they're used by the interpreter and are generally the concrete implementations of overloaded operators or other built-in functionality.

mipadi
That's good to know, but Python is my first programming language. I don't know what "private" is or name mangling, as delnan posted in a comment below. Thanks though, this answer will be much better for me once I do.
Andrew
From what I hear, you don't typically use the `__` lead to do private, but rather you use it when your function might step on the namespace of another. With that in mind, I've seen people say to use a single underscore to mark it as an "implementation detail".
Daenyth
+1  A: 

You can find some information here:

http://docs.python.org/reference/lexical_analysis.html#reserved-classes-of-identifiers

Maciej Kucharz
+1  A: 

Well, power for the programmer is good, so there should be a way to customize behaviour. Like operator overloading (__add__, __div__, __ge__, ...), attribute access (__getattribute__, __getattr__ (those two are differnt), __delattr__, ...) etc. In many cases, like operators, the usual syntax maps 1:1 to the respective method. In other cases, there is a special procedure which at some point involves calling the respective method - for example, __getattr__ is only called if the object doesn't have the requested attribute and __getattribute__ is not implemented or raised AttributeError. And some of them are really advanced topics that get you deeeeep into the object system's guts and are rarely needed. So no need to learn them all, just consult the reference when you need/want to know. Speaking of reference, here it is.

delnan
You're pretty good at explaining things, thanks! One question though, can you name a common case (perhaps a link to a code example) where these double underscore methods do something different than the operators?
Andrew
Huh? The special methods that are operators are just called right away (afaik). Of course there are many others that don't represent operators, like the `__getattribute__` we both already named or `__enter__`/`__exit__` for with statements (`with open("file") as f: do_stuff(f)`). In these cases, there are some more complex rules, and the documentation knows them better than I.
delnan
Ah, I see. This stuff's still a bit confusing to me, but you really helped clear most of it up. Thank you, delnan.
Andrew
@S.Lott: Wat? It's right there, the last three words.
delnan
@delnan: Too hidden. Should have been first or more prominent.
S.Lott
+1  A: 

They are used to specify that the Python interpreter should use them in specific situations.

E.g., the __add__ function allows the + operator to work for custom classes. Otherwise you will get some sort of not defined error when attempting to add.

Paul Nathan
Nice and concise. Where else might they show up commonly in Python besides custom classes?
Andrew
A: 

From an historical perspective, leading underscores have often been used as a method for indicating to the programmer that the names are to be considered internal to the package/module/library that defines them. In languages which do not provide good support for private namespaces, using underscores is a convention to emulate that. In Python, when you define a method named '__foo__' the maintenance programmer knows from the name that something special is going on which is not happening with a method named 'foo'. If Python had choosen to use 'add' as the internal method to overload '+', then you could never have a class with a method 'add' without causing much confusion. The underscores serve as a cue that some magic will happen.

William Pursell
A: 

Here is the creator of Python explaining it:

... rather than devising a new syntax for special kinds of class methods (such as initializers and destructors), I decided that these features could be handled by simply requiring the user to implement methods with special names such as _init_, _del_, and so forth. This naming convention was taken from C where identifiers starting with underscores are reserved by the compiler and often have special meaning (e.g., macros such as _FILE_ in the C preprocessor).

...

I also used this technique to allow user classes to redefine the behavior of Python's operators. As previously noted, Python is implemented in C and uses tables of function pointers to implement various capabilities of built-in objects (e.g., “get attribute”, “add” and “call”). To allow these capabilities to be defined in user-defined classes, I mapped the various function pointers to special method names such as _getattr_, _add_, and _call_. There is a direct correspondence between these names and the tables of function pointers one has to define when implementing new Python objects in C.

Steven Rumbalski