The standard way of doing singletons in Python is

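class Singleton(object):
    _instance = None

    def __new__(cls, *args, **kwargs):
        # Create the instance only on the first call; later calls reuse it.
        if cls._instance is None:
            # object.__new__ accepts no extra arguments, so don't forward them.
            cls._instance = super(Singleton, cls).__new__(cls)
        return cls._instance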
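
One detail of this pattern is worth knowing, shown here as a minimal sketch: because `__new__` returns an instance of the class, Python still calls `__init__` on every construction, so a later call can overwrite attributes set by an earlier one. The `Config` subclass and its `sitename` argument are made up for illustration.

```python
class Singleton(object):
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super(Singleton, cls).__new__(cls)
        return cls._instance


class Config(Singleton):
    # Hypothetical subclass, only here to show that __init__ runs every time.
    def __init__(self, sitename="My site"):
        self.sitename = sitename


first = Config()
second = Config("Other site")

print(first is second)   # both names refer to the same object
print(first.sitename)    # the second call's __init__ overwrote the attribute
```

If re-initialisation is unwanted, one option is to do the setup inside `__new__` on first creation rather than in `__init__`.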

However, this doesn't work on App Engine, since there may be many servers and we would get one instance per server. So how would we do it for an App Engine entity?

Something like:

class MySingleton(db.Model):
    def __init__(self, *args, **kwargs):
        query = MySingleton.all()
        if query.count() > 0:
            return query.get()

        super(MySingleton, self).__init__(*args, **kwargs)

This leads to a recursion error, since get() calls __init__.

How we're going to use it:

We just want to represent a configuration file, i.e.:

{ 'sitename': "My site", 'footer': "This page owned by X"}
+1  A: 

__init__ cannot usefully return anything: just like in the first example, override __new__ instead!

Alex Martelli
This leads to a recursion error, same as with `__init__`. Though I haven't come across `__new__` before, and the docs aren't very clear on how to use it.
Paul Biggar
I think the docs at http://docs.python.org/reference/datamodel.html?highlight=__new__#object.__new__ are **quite** clear - what problem do you have with them? But the key tip to avoid recursion is to introduce an intermediate class that's the one that gets saved in the store -- `MySingleton` subclasses it, and is what's seen from all the rest of app-level code, but delegates all storage fetching and saving to the intermediate class (you could also do it the other way 'round -- with a class subclassing `MySingleton` being the one doing gets and puts).
Alex Martelli
Those are the docs we looked at. It's clear how it works normally, but not how it can help us. For the rest, I'm not following. A code sample would help immensely.
Paul Biggar
+1  A: 

Singletons are usually a bad idea, and I'd be interested to see what makes this an exception. Typically they're just globals in disguise, and apart from all the old problems with globals (e.g. see http://c2.com/cgi/wiki?GlobalVariablesAreBad, in particular the bit at the top talking about non-locality, implicit coupling, concurrency issues, and testing and confinement), in the modern world you get additional problems caused by distributed and concurrent systems. If your app is potentially running across multiple servers, can you meaningfully have both instances of your application operate on the same singleton instance both safely and correctly?

If the object has no state of its own, then the answer is yes, but you don't need a singleton, just a namespace.

But if the object does have some state, you need to worry about how the two application instances are going to keep the details synchronised. If two instances try reading and then writing to the same instance concurrently then your results are likely to be wrong. (E.g. a HitCounter singleton that reads the current value, adds 1, and writes the result back can miss hits this way, and that's about the least damaging example I can think of.)
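
The missed hit can be shown with a minimal sketch that plays out the interleaving by hand. The `shared` dict is a hypothetical stand-in for whatever storage the two application instances have in common; no real App Engine API is used.

```python
# Hypothetical stand-in for shared storage (datastore, memcache, ...).
shared = {"hits": 0}


def read_hits():
    return shared["hits"]


def write_hits(value):
    shared["hits"] = value


# Two application instances (A and B) each handle one request, interleaved:
seen_by_a = read_hits()      # A reads 0
seen_by_b = read_hits()      # B reads 0, before A has written
write_hits(seen_by_a + 1)    # A writes 1
write_hits(seen_by_b + 1)    # B also writes 1, overwriting A's update

print(shared["hits"])  # prints 1, although two hits occurred
```

Avoiding this requires the read and the write to happen as one atomic step, which is what transactions or atomic increment operations are for.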

I am largely unfamiliar with it, so perhaps Google App Engine has some transactional logic to handle all this for you, but that presumably means you'll have to add some extra stuff in to deal with rollbacks and the like.

So my basic advice would be to see if you can rewrite the algorithm or system without resorting to using a singleton.

Kylotan
We just want a global variable. Google App Engine handles the safety, correctness, and synchronisation, but there are a few tricky bits involved. For example, a single App Engine entity (you can think of it as an object) can actually be instantiated as multiple Python objects. Hence the question.
Paul Biggar
I downvoted for the suggestion of rewriting the algorithm. I need a hashtable accessible from anywhere in my program in which I keep settings. The common vocabulary is a singleton (a global would be an acceptable implementation of a singleton if it would work, which it won't). I would be curious how any algorithm with that requirement could be rewritten.
Paul Biggar
Finally, the suggestion of rewriting an entire system because someone feels singletons are bad practice is itself bad practice. Obviously, if you felt that the old reasons (I presume reentrancy is what you mean here) are valid, you could have argued this. I don't feel that anyone will learn by just saying "X are usually a bad idea" without qualifying it at least. (I apologize for ranting when you obviously put some time into your answer, but this sort of thing is common on SO - and almost every tech forum - and it drives me insane.)
Paul Biggar
Hi Paul: I'm afraid I have to stick with my original assessment: a singleton is just a Design-Patterns-authorised global variable and global variables are considered bad for well-documented reasons. This is a prime example, because different parts of the code (in this case, completely separate application instances) can try and modify the data without each other's knowledge.
Kylotan
GAE isn't going to fix the mutual exclusion problem for you. It does provide transactional access to the data store however, which is probably the way you'll need to go, and that means abandoning the idea of any directly shared state within your app and instead just sharing the data API. It also provides memcache which is a potentially quicker way of approaching the same goal, again by farming out the state to a separate process and requesting it on demand instead of attempting to share it directly.
Kylotan
@Kylotan: Of course it's a global variable. That's the idea. Think of it as a configuration file if the idea of globals is unpalatable.
Paul Biggar
A configuration file isn't a global variable. A global variable is just one way to represent the values in that file. It's not the way I would choose because global variables cause problems.
Kylotan
What problems can it cause? This isn't C or C++. We can only have one of these objects. How else would you represent a configuration file?
Paul Biggar
The problems with globals have nothing to do with the language. I've edited my answer to provide a link that talks about this. As for your last question, you don't need to represent a configuration file, you need to read and write configuration settings that come from the file, which is a different abstraction. I humbly refer you back towards my comment on transactions and/or memcache regarding this.
Kylotan
No. We can't store files, since we're on App Engine, so we don't need to read or write from/to anything. It needs to be represented either in memory or an interface. Either way, we don't want there to be two.
Paul Biggar
I think you misread my comment - I wasn't suggesting using files.
Kylotan
+1  A: 

If you aren't going to store the data in the datastore, why don't you just create a module with variables instead of a db.Model?

Name your file mysettings.py and inside it write:

sitename = "My site"
footer = "This page owned by X"

Then the Python module effectively becomes a "singleton". You can even add functions, if needed. To use it, you do something like this:

import mysettings
print mysettings.sitename

That's how Django deals with this, with its DJANGO_SETTINGS_MODULE.
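
The reason a module behaves like a singleton is that Python caches imported modules in `sys.modules`, so every `import` in the same interpreter gets the same object. A minimal sketch, assuming nothing beyond the standard library; the module is built in memory here only so the example is self-contained, where normally it would simply be the file mysettings.py:

```python
import sys
import types

# Build the module in memory so the sketch is self-contained; normally this
# would simply be the file mysettings.py.
_module = types.ModuleType("mysettings")
_module.sitename = "My site"
sys.modules["mysettings"] = _module

import mysettings as first
import mysettings as second

# A change made through one name is visible through the other.
first.sitename = "Renamed site"
print(first is second)
print(second.sitename)
```

This holds per interpreter process only; each server process imports its own copy of the module.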

Update: It sounds like you really want to use a db.Model, but use memcache so you only retrieve the object once. But you'll have to come up with a way to flush it when you change data, or give it a timeout so that it gets re-fetched occasionally. I'd probably go with the timeout version and do something like this in mysettings.py:

import logging

from google.appengine.api import memcache
from google.appengine.ext import db

class MySettings(db.Model):
    # properties...
    pass

def Settings():
    key = "mysettings"
    obj = memcache.get(key)
    if obj is None:
        obj = MySettings.all().get()  # assume there is only one
        if obj:
            # cache the entity for 360 seconds
            memcache.add(key, obj, 360)
        else:
            logging.error("no MySettings found, create one!")
    return obj

Or, if you don't want to use memcache, then just store the object in a module-level variable and always use the Settings() function to reference it. But then you'll have to implement a way to flush it, since otherwise the stale copy stays around until the interpreter instance is recycled. I would normally use memcache for this sort of functionality.
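
The module-level variant might look roughly like the sketch below. It is an illustration under stated assumptions, not App Engine code: `_load_settings` is a hypothetical stand-in for `MySettings.all().get()`, `LOADS` exists only to count fetches, and the `now` parameter is added so the timeout can be exercised without waiting.

```python
import time

CACHE_SECONDS = 360
LOADS = []  # records each (hypothetical) datastore fetch, for illustration only

_cached = None
_cached_at = 0.0


def _load_settings():
    """Hypothetical loader; on App Engine this would be MySettings.all().get()."""
    LOADS.append(1)
    return {"sitename": "My site", "footer": "This page owned by X"}


def Settings(now=None):
    """Return the cached settings, reloading them once they are too old."""
    global _cached, _cached_at
    now = time.time() if now is None else now
    if _cached is None or now - _cached_at > CACHE_SECONDS:
        _cached = _load_settings()
        _cached_at = now
    return _cached


def flush_settings():
    """Drop the cached copy so the next Settings() call reloads it."""
    global _cached
    _cached = None


Settings(now=0)      # first call loads
Settings(now=100)    # still fresh, served from the module-level variable
Settings(now=1000)   # older than CACHE_SECONDS, reloads
flush_settings()
Settings(now=1001)   # reloads because of the explicit flush
print(len(LOADS))
```

Note that `flush_settings()` only clears the copy in the current interpreter; other instances keep theirs until the timeout expires.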

dar
We need the data synchronized over all instances, so we are certainly going to store it in the datastore.
Paul Biggar