I have a tree structure of widgets, e.g. a collection contains models and a model contains widgets. I want to copy the whole collection. copy.deepcopy is faster than pickling and unpickling the object with pickle, but cPickle, being written in C, is much faster still, so:

  1. Why shouldn't I (we) always use cPickle instead of deepcopy?
  2. Is there any other copy alternative? pickle is slower than deepcopy but cPickle is faster, so maybe a C implementation of deepcopy would be the winner.

Sample test code:

import copy
import pickle
import cPickle

class A(object): pass

d = {}
for i in range(1000):
    d[i] = A()

def copy1():
    return copy.deepcopy(d)

def copy2():
    return pickle.loads(pickle.dumps(d, -1))

def copy3():
    return cPickle.loads(cPickle.dumps(d, -1))

Timings:

>python -m timeit -s "import c" "c.copy1()"
10 loops, best of 3: 46.3 msec per loop

>python -m timeit -s "import c" "c.copy2()"
10 loops, best of 3: 93.3 msec per loop

>python -m timeit -s "import c" "c.copy3()"
100 loops, best of 3: 17.1 msec per loop
+2  A: 

You should be using deepcopy because it makes your code more readable. Using a serialization mechanism to copy objects in memory is at the very least confusing to another developer reading your code. Using deepcopy also means you get to reap the benefits of future optimizations in deepcopy.

First rule of optimization: don't.

wds
Second rule of optimization: Don't yet
voyager
Forget the outdated rules: replacing deepcopy with cPickle makes my project's rendering 25% faster and makes my customer happy :)
Anurag Uniyal
@wds, also I don't think it would be confusing if the wrapper function that copies the object is named deepcopyObject and has a good comment.
Anurag Uniyal
@anurag if you're having such speed problems, by all means. Though it still looks like more of a band-aid to me.
wds
+12  A: 

Problem is, pickle+unpickle can be faster (in the C implementation) because it's less general than deepcopy: many objects can be deepcopied but not pickled. Suppose for example that your class A were changed to...:

class A(object):
  class B(object): pass
  def __init__(self): self.b = self.B()

Now copy1 still works fine (A's complexity slows it down but absolutely doesn't stop it); copy2 and copy3 break, and the end of the stack trace says...:

  File "./c.py", line 20, in copy3
    return cPickle.loads(cPickle.dumps(d, -1))
PicklingError: Can't pickle <class 'c.B'>: attribute lookup c.B failed

I.e., pickling always assumes that classes and functions are top-level entities in their modules, and so pickles them "by name" -- deepcopying makes absolutely no such assumptions.
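A minimal sketch of the same "by name" limitation, written so that it also runs on Python 3 (where the standard pickle module replaces cPickle). The names make_widget and Widget are hypothetical and not part of the question's code. The class is defined inside a function rather than inside another class, because newer Python versions may be able to pickle class-nested classes, whereas a function-local class still cannot be looked up by name:

```python
import copy
import pickle


def make_widget():
    # Hypothetical class, local to this function: pickle cannot find it
    # by name in the module, so instances of it are not picklable.
    class Widget(object):
        def __init__(self):
            self.children = [1, 2, 3]
    return Widget()


w = make_widget()

# deepcopy makes no assumption about where the class lives.
deep = copy.deepcopy(w)

# The pickle round trip fails; the exact exception type varies by version.
try:
    pickled = pickle.loads(pickle.dumps(w, -1))
except (pickle.PicklingError, AttributeError, TypeError):
    pickled = None
```

Here deep is an independent copy of w, while pickled ends up as None because the dump step raises.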

So if you have a situation where speed of "somewhat deep-copying" is absolutely crucial, every millisecond matters, AND you want to take advantage of special limitations that you KNOW apply to the objects you're duplicating (such as those that make pickling applicable, or ones favoring yet other forms of serialization and other shortcuts), by all means go ahead. But if you do, you MUST be aware that you're constraining your system to live by those limitations forevermore, and you must document that design decision very clearly and explicitly for the benefit of future maintainers.
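One way such a documented shortcut could look is sketched below. The name deepcopyObject is borrowed from the comment thread above; the fallback to copy.deepcopy is an assumption added here for illustration, not something anyone in this thread posted. On Python 2 you would use cPickle in place of pickle.

```python
import copy
import pickle


def deepcopyObject(obj):
    """Copy obj via a pickle round trip, falling back to copy.deepcopy.

    Design constraint: the fast path only applies to picklable objects
    (for example, their classes must be importable by name).
    """
    try:
        return pickle.loads(pickle.dumps(obj, -1))
    except (pickle.PicklingError, AttributeError, TypeError):
        # Not picklable: use the general, slower mechanism instead.
        return copy.deepcopy(obj)


# Picklable data takes the fast path.
original = {"model": {"widgets": [[1, 2], [3, 4]]}}
duplicate = deepcopyObject(original)
duplicate["model"]["widgets"][0].append(99)  # original is unaffected

# A lambda is not picklable, so this call falls back to deepcopy.
holder = {"callback": lambda: 1}
holder_copy = deepcopyObject(holder)
```

Note that the two paths are not guaranteed to behave identically for classes that customize copying or pickling, which is exactly the kind of limitation that needs documenting.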

For the NORMAL case, where you want generality, use deepcopy!-)

Alex Martelli
A: 

Even faster would be to avoid the copy in the first place. You mention that you are doing rendering. Why does it need to copy objects?

Ned Batchelder
Yes, ideally it wouldn't need a copy, since the view (rendering) and the model would be decoupled. But in my case rendering does modify the model, so I need to copy the model before rendering so that the original doesn't get modified.
Anurag Uniyal
I don't mean to beat a dead horse, but fixing the problem where rendering modifies the model will make you very happy.
Ned Batchelder
I agree, but changing so much code would be costly. Since a discussion isn't possible here, I have added a question: http://stackoverflow.com/questions/1414246/how-to-decouple-model-view-for-widgets
Anurag Uniyal