views:

140

answers:

1

Hi all,

I am considering the use of Quantities to define a number together with its unit. This value most likely will have to be stored on the disk. As you are probably aware, pickling has one major issue: if you relocate the module around, unpickling will not be able to resolve the class, and you will not be able to unpickle the information. There are workarounds for this behavior, but they are, indeed, workarounds.

A solution I fantasized for this issue would be to create a string encoding uniquely a given unit. Once you obtain this encoding from the disk, you pass it to a factory method in the Quantities module, which decodes it to a proper unit instance. The advantage is that even if you relocate the module around, everything will still work, as long as you pass the magic string token to the factory method.

Is this a known concept?

+1  A: 

Looks like an application of Wheeler's First Principle, "all problems in computer science can be solved by another level of indirection" (the Second Principle adds "but that will usually create another problem";-). Essentially what you need to do is an indirection to identify the type -- entity-within-type will be fine with pickling-like approaches (you can study the sources of pickle.py and copy_reg.py for all the fine details of the latter).

Specifically, I believe that what you want to do is subclass pickle.Pickler and override the save_inst method. Where the current version says:

    if self.bin:
        save(cls)
        for arg in args:
            save(arg)
        write(OBJ)
    else:
        for arg in args:
            save(arg)
        write(INST + cls.__module__ + '\n' + cls.__name__ + '\n')

you want to write something different than just the class's module and name -- some kind of unique identifier (made up of two string) for the class, probably held in your own registry or registries; and similarly for the save_global method.

It's even easier for your subclass of Unpickler, because the _instantiate part is already factored out in its own method: you only need to override find_class, which is:

def find_class(self, module, name):
    # Subclasses may override this
    __import__(module)
    mod = sys.modules[module]
    klass = getattr(mod, name)
    return klass

it must take two strings and return a class object; you can do that through your registries, again.

Like always when registries are involved, you need to think about how to ensure you register all objects (classes) of interest, etc, etc. One popular strategy here is to leave pickling alone, but ensure that all moves of classes, renames of modules, etc, are recorded somewhere permanent; this way, just the subclassed unpickler can do all the work, and it can most conveniently do it all in the overridden find_class -- bypassing all issues of registration. I gather you consider this a "workaround" but to me it seems just an extremely simple, powerful and convenient implementation of the "one more level of indirection" concept, which avoids the "one more problem" issue;-).

Alex Martelli