I am making a small program and at some point from each row of a matrix I need to subtract the average of the row itself. Quite a standard renormalization procedure.

Here is the code:

import numpy as np

def subtractaverage(data):
    # Subtract each row's average from every cell in that row.
    datanormalized=[]
    for row in data:
        average_row=sum(row)/len(row)
        print "average=",average_row
#       renormalized_row=[cell-average_row for cell in row]
        renormalized_row=[-average_row+cell for cell in row]
        datanormalized.append(renormalized_row)
    matrixnormalized=np.array(datanormalized)
    return matrixnormalized

The lines in question are:

# renormalized_row=[cell-average_row for cell in row]
renormalized_row=[-average_row+cell for cell in row]

I first tried the first line (cell-average_row) and it did NOT work. The result was that renormalized_row ended up being equal to row.

The second line, by contrast, worked. So somehow it seems that the interpreter is reading [cell-average_row for cell in row] as [cell for cell in row].

But if I write:

renormalized_row=[cell-100 for cell in row] 

it works fine (and produces a new list with the value 100 subtracted from each cell). I then tried another small program:

rs=range(10)
val=5
t=[r-val for r in rs]
print t,rs

This also works and produces

[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

as it should.

So now I am at a loss. Yes, I can use renormalized_row=[-average_row+cell for cell in row], but I would like to understand what is going on. Why this apparent inconsistency in the way the expression is interpreted?

I am using Python 2.6.5 (2.6.6 doesn't have a .dmg for Mac) on OS X 10.6.4.

Thanks

Trying the program later in the day, on another set of data, it actually worked. Testing it again on the original data, it works too. I am even more confused, but now I even lack the casus belli to show that something was not working as it should.

Can we please close this question?

+1  A: 

I guess the problem is the integer division (if row consists of integers only):

average_row=sum(row)/len(row)

which will give you an average of 0 if the sum of the row is non-negative and smaller than its length. Try

average_row=sum(row)/float(len(row))

instead.
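
As a rough illustration of the difference, here is a minimal sketch. The sample rows are made up, not taken from the question's data. It is written so that it also runs on Python 3, so `//` stands in for what `/` does to two integers on Python 2.

```python
# Hypothetical sample row of integers (not from the original data).
row = [1, 2, 2]

# Floor division: what Python 2's `/` does when both operands are ints.
floor_average = sum(row) // len(row)

# Converting one operand to float forces true division on any Python version.
true_average = sum(row) / float(len(row))

print("floor average:", floor_average)
print("true average:", true_average)

# A row whose non-negative sum is smaller than its length floors to 0.
small_row = [0, 1, 0, 0]
small_floor = sum(small_row) // len(small_row)
small_true = sum(small_row) / float(len(small_row))
print("small row:", small_floor, small_true)
```

If the row already holds floats, both forms agree and the conversion is harmless.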

jellybean
Thanks, but that is not the case. Since I print the average beforehand (print "average=",average_row), I can check that it is not equal to zero. Thanks for trying to help, nevertheless!
Pietro Speroni
Ok, but integer division will still give you imprecise results, so doing a float conversion will be helpful in any case.
jellybean
Thanks. Actually, all the numbers I am using are floats.
Pietro Speroni
Could you print the result of (cell-average_row)? average_row may be too small, so that the whole expression evaluates to cell (I find it strange, but I don't know enough about Python's float implementation).
jumpifzero
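
For what it is worth, the effect described in the comment above can be reproduced with a small sketch. The values below are hypothetical, picked only so that the average is far smaller than the gap between neighbouring doubles near the cell value. The second half checks, on a made-up row, that the two spellings from the question give the same list, which is what IEEE 754 arithmetic would lead one to expect.

```python
# Hypothetical values: the average is far below the cell's float precision.
cell = 1.0e6
average_row = 1.0e-12

# Near 1e6 adjacent doubles are about 1.2e-10 apart,
# so subtracting 1e-12 is lost in rounding and the cell comes back unchanged.
absorbed = (cell - average_row) == cell

# Both spellings from the question, applied to a made-up row of floats.
row = [0.5, 1.25, 3.0]
avg = sum(row) / len(row)
first = [c - avg for c in row]
second = [-avg + c for c in row]
same_result = first == second

print("absorbed:", absorbed)
print("same result:", same_result)
```

Under these assumptions both checks should come out true, which suggests that the order of the operands alone would not explain a difference between the two lines.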
hi jumpifzero. Now the whole program works fine. I don't know what changed. Let's just all forget all this :-/
Pietro Speroni