Simplified version of my code:

sequence = [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]

def speciate(sequence):
    lineage_1 = []
    lineage_2 = []

    for i in sequence:
        lineage_1.append(i)
    for k in sequence:
        lineage_2.append(k)

    lineage_1[0][0] = 'L1_A'
    lineage_1[1][0] = 'L1_B'
    lineage_2[0][0] = 'L2_A'
    lineage_2[1][0] = 'L2_B'

    print lineage_1
    print lineage_2

speciate(sequence)

outputs:

[['L2_A', 'AAAAAAAA'], ['L2_B', 'BBBBBBB']]
[['L2_A', 'AAAAAAAA'], ['L2_B', 'BBBBBBB']]

when I would expect to get this:

[['L1_A', 'AAAAAAAA'], ['L1_B', 'BBBBBBB']]
[['L2_A', 'AAAAAAAA'], ['L2_B', 'BBBBBBB']]

Does anybody know what the problem is?

+2  A: 

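You have to copy each sublist when you append it (a shallow copy suffices in this case; a deep copy also works). Otherwise lineage_1[0] and lineage_2[0] reference the same list object, so assigning to lineage_2[0][0] also changes what you see through lineage_1[0][0].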

from copy import deepcopy
for i in sequence:
    lineage_1.append(deepcopy(i))  # append an independent copy of the sublist
for k in sequence:
    lineage_2.append(deepcopy(k))

See also: http://docs.python.org/library/copy.html
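For reference, here is a minimal sketch of the question's function with the copies in place. Returning the two lists instead of printing them inside the function, and the list comprehensions in place of the append loops, are my own assumptions to keep the example short and checkable; the renaming logic is unchanged.

```python
from copy import deepcopy

def speciate(sequence):
    # Copy every sublist so that each lineage owns its own lists.
    lineage_1 = [deepcopy(item) for item in sequence]
    lineage_2 = [deepcopy(item) for item in sequence]

    lineage_1[0][0] = 'L1_A'
    lineage_1[1][0] = 'L1_B'
    lineage_2[0][0] = 'L2_A'
    lineage_2[1][0] = 'L2_B'
    return lineage_1, lineage_2

sequence = [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]
lineage_1, lineage_2 = speciate(sequence)
print(lineage_1)  # [['L1_A', 'AAAAAAAA'], ['L1_B', 'BBBBBBB']]
print(lineage_2)  # [['L2_A', 'AAAAAAAA'], ['L2_B', 'BBBBBBB']]
print(sequence)   # [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]
```

Note that the original sequence is left untouched as well, which was not the case in the question's version.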

Michael
+1, but you also don't need the append loop if you use deepcopy. For example, lineage_1 = deepcopy(sequence) is enough.
dcolish
True, it certainly is shorter. I just wanted to point out the problem.
Michael
A: 

Your for-loops append the very same list objects (sequence[0] and sequence[1]) to both lineage_1 and lineage_2.

So when you modify the first element of that list:

lineage_1[0][0] = 'L1_A'
lineage_1[1][0] = 'L1_B'
lineage_2[0][0] = 'L2_A'
lineage_2[1][0] = 'L2_B'

you see the change in both lineage_X lists, because both hold references to the same sublists that are in sequence, not copies of them. The last assignment to each sublist wins, which is why everything ends up as L2_*.

Do something like:

import copy
for i in sequence:
    lineage_1.append(copy.copy(i))
for k in sequence:
    lineage_2.append(copy.copy(k))

This makes copies of the sublists of sequence so that you don't have this aliasing issue. (If the real code has deeper nesting, use copy.deepcopy instead of copy.copy.)
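To illustrate the parenthetical about deeper nesting, here is a small sketch. The nested structure below is hypothetical (the question's data is only two levels deep); it just shows where copy.copy stops being enough.

```python
import copy

# Hypothetical data with one more level of nesting than in the question.
nested = [['WT_1', ['A', 'A']]]

shallow = [copy.copy(item) for item in nested]
deep = [copy.deepcopy(item) for item in nested]

shallow[0][0] = 'L1_A'   # only changes the copied sublist
shallow[0][1][0] = 'X'   # the innermost list is still shared with nested
deep[0][1][1] = 'Y'      # the deep copy shares nothing with nested

print(nested)   # [['WT_1', ['X', 'A']]]
print(shallow)  # [['L1_A', ['X', 'A']]]
print(deep)     # [['WT_1', ['A', 'Y']]]
```

copy.copy duplicates only the list it is given; anything that list contains is still shared, so the 'X' leaks back into nested while the 'Y' does not.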

bstpierre
A: 

Consider this simple example:

>>> aa = [1, 2, 3]
>>> bb = aa
>>> bb[0] = 999
>>> aa
[999, 2, 3]

What happened here?

"Names" like aa and bb are just references to one and the same list. Hence when you change the list through bb, aa sees the change as well. Using id shows this in action:

>>> id(aa)
32343984
>>> id(bb)
32343984

Now, this is exactly what happens in your code:

for i in sequence:
    lineage_1.append(i)
for k in sequence:
    lineage_2.append(k)

You append references to the same lists to lineage_1 and lineage_2.
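You can check this directly with the is operator, which compares object identity in the same way as comparing id values. A short sketch using the data from the question:

```python
sequence = [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]
lineage_1 = []
lineage_2 = []
for i in sequence:
    lineage_1.append(i)
for k in sequence:
    lineage_2.append(k)

# The two outer lists are distinct objects...
print(lineage_1 is lineage_2)        # False
# ...but their elements are the very same sublists that live in sequence.
print(lineage_1[0] is lineage_2[0])  # True
print(lineage_1[0] is sequence[0])   # True

# So a change made through one name is visible through all of them.
lineage_2[0][0] = 'L2_A'
print(lineage_1[0][0])  # L2_A
print(sequence[0][0])   # L2_A
```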

Eli Bendersky