ansaurus

Question

How do you make this code more pythonic?

Answer 1

A:

You could make use of the with statement.

Geo 2009-06-17 14:02:42

except that nobody *gets* the with statement yet...

Daren Thomas 2009-06-17 14:48:25

-1: in what way? Just randomly? or with something as aPurpose?

S.Lott 2009-06-17 14:56:29

for file handling.

Geo 2009-06-17 19:31:41
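
As a minimal sketch of what "the with statement for file handling" could look like: the function below opens both data files with `with`, so they are closed automatically even if parsing fails partway through. The `read_pairs` helper, the whitespace-separated file layout, and the demo data are assumptions made for illustration, not the original poster's code. On Python 2.5 the statement needs `from __future__ import with_statement`; from 2.6 on it is built in.

```python
import os
import tempfile


def read_pairs(x_path, y_path):
    """Return a list of ((x1, x2), y) tuples read from two parallel files."""
    pairs = []
    # Both files are closed when their blocks exit, even if parsing raises.
    with open(x_path) as xfile:
        with open(y_path) as yfile:
            for xline, yline in zip(xfile, yfile):
                # Assumed layout: two whitespace-separated numbers per x line.
                x1, x2 = [float(v) for v in xline.split()[:2]]
                pairs.append(((x1, x2), float(yline)))
    return pairs


# Tiny demo with made-up data written to a temporary directory.
demo_dir = tempfile.mkdtemp()
x_path = os.path.join(demo_dir, "q1x.dat")
y_path = os.path.join(demo_dir, "q1y.dat")
with open(x_path, "w") as f:
    f.write("  1.5  2.5\n  3.0  4.0\n")
with open(y_path, "w") as f:
    f.write("0\n1\n")

demo_pairs = read_pairs(x_path, y_path)
```

The same pattern avoids relying on the garbage collector to close files that are opened inside a loop.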

yeah, he could also make use of httplib to add some spice.. ?

yairchu 2009-06-17 19:33:00

that's the best you could come up with?

Geo 2009-06-17 20:06:04

@Geo: it was the best I could come up with. sorry for the sarcasm :)

yairchu 2009-06-17 20:45:52

Answer 2

+9 A:

One obvious change is to get rid of the "for i in range(1, 100):" and just iterate over the file lines. To iterate over both files (xfile and yfile), zip them, i.e. replace that block with something like:

import itertools

for xline, yline in itertools.izip(xfile, yfile):
    s = xline.split("  ")
    x[0] = float(s[1])
    x[1] = float(s[2])
    y = float(yline)
    ...

(This assumes the file is 100 lines long, i.e. you want the whole file.) If you're deliberately restricting to the first 100 lines, you could use something like:

for i, xline, yline in itertools.izip(range(100), xfile, yfile):
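
Another way to cap the number of lines, sketched here with in-memory lists standing in for the two files, is `itertools.islice`, which stops the zipped iterator after n items without a dummy counter variable. This is a swapped-in alternative to the `izip(range(100), ...)` form above, and `first_n_pairs` is a hypothetical helper name.

```python
import itertools


def first_n_pairs(xlines, ylines, n):
    """Pair up lines from two iterables, keeping at most the first n pairs."""
    # On Python 2, itertools.izip in place of zip keeps the pairing lazy.
    return list(itertools.islice(zip(xlines, ylines), n))


# Made-up lines standing in for the contents of xfile and yfile.
xlines = ["  1.0  2.0\n", "  3.0  4.0\n", "  5.0  6.0\n"]
ylines = ["0\n", "1\n", "1\n"]

first_two = first_n_pairs(xlines, ylines, 2)
```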

However, it's also inefficient to iterate over the same file 6 times. It's better to load it into memory in advance and loop over it there, i.e. outside your loop, have:

xfile = open("q1x.dat", "r")
yfile = open("q1y.dat", "r")
# convert the x columns to floats too, not just the y values
data = zip([map(float, line.split("  ")[1:3]) for line in xfile], map(float, yfile))

And inside just:

for (x1,x2), y in data:
    x[0] = x1
    x[1] = x2
    ...

Brian 2009-06-17 14:23:22

that should be line.split(" ")[1:3] as yairchu did it. Split two spaces, this site edits my code.

MercerKernel 2009-06-17 20:30:41

Oops, you're right - I missed that. Updated.

Brian 2009-06-18 08:46:58

Answer 3

A:

the code that reads the files into lists could be drastically simpler

for line in open("q1x.dat", "r"):
    x = map(float,line.split("  ")[1:])
y = map(float, open("q1y.dat", "r").readlines())

Nathan 2009-06-17 14:25:21

you are overwriting x all the time. perhaps you intended x = [map(float, line.split(" ")[1:]) for line in open("q1x.dat", "r")] ??

yairchu 2009-06-17 14:28:19

wait doesn't his code do that too?

Nathan 2009-06-17 15:00:42

@Nathan: to clear up some confusion - let's call your "y" "ys", as it holds a list of all the "y"s in OP's code. your "x", however, could not be renamed to "xs", as it holds the same values as his "x". then it would seem odd that you calculate "x" and "ys" and not "xs" and "ys". I hope I'm clear..

yairchu 2009-06-17 19:29:47

Answer 4

+3 A:

import math
from numpy import matrix, zeros
from numpy.linalg import inv

x = matrix([[0.],[0],[1]])
theta = matrix(zeros([3,1]))
for i in range(5):
  grad = matrix(zeros([3,1]))
  hess = matrix(zeros([3,3]))
  [xfile, yfile] = [open('q1'+a+'.dat', 'r') for a in 'xy']
  for xline, yline in zip(xfile, yfile):
    x.transpose()[0,:2] = [map(float, xline.split("  ")[1:3])]
    y = float(yline)
    hypoth = 1 / (1 + math.exp(theta.transpose() * x))
    grad += (y - hypoth) * x
    hess -= hypoth * (1 - hypoth) * x * x.transpose()
  theta += inv(hess) * grad
print "done"
print theta

yairchu 2009-06-17 14:26:48

I don't know, the code works fine. When they're matrices it seems to do inner product. I got the code from here (http://www.scipy.org/SciPy_Tutorial)

MercerKernel 2009-06-17 18:43:52

@MercerKernel: cool - I learned something new! my use of arrays instead of matrices forced me to use dot. with matrices you can use "*"! I fixed the code and also allowed myself to make theta = -old_theta to simplify

yairchu 2009-06-17 19:27:18

hmm, your map line complains about dimensions. I had to rewrite it as x[:2] = array([xline.split(" ")[1:3]], dtype=float).transpose(). Same thing on the math.exp line: I think you need theta.transpose()

MercerKernel 2009-06-17 19:45:40

@MercerKernel: thanks. I think I fixed it. I don't have input files to run it with so I didn't test it.. :)

yairchu 2009-06-17 20:44:23

hah, ok. It's still not very pythonic, but it's much more valuable :)

MercerKernel 2009-06-17 20:56:33

several things make it *more* pythonic than your post: range(5) instead of range(1,6) because "real men" enumerate from 0. iterating files to get lines. zip. "a +=" instead of "a = a +". not closing the files and letting the garbage collector do it :)

yairchu 2009-06-17 21:35:50

MercerKernel 2009-06-17 22:03:45

This isn't so much an improvement in its "pythonicity" but you should never, ever, *ever* use inv(mat) * vector to solve a linear system. Use solve(mat, vector) - this does less floating point ops and generally leads to less error in the result due to rounding.

dwf 2009-07-22 01:38:10
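
A minimal sketch of the difference, assuming NumPy and a small made-up 2x2 system in place of the real Hessian and gradient (the names `hess` and `grad` are borrowed from the answer, the values are invented): both lines compute the same Newton step, but `numpy.linalg.solve` does so without forming the inverse explicitly.

```python
import numpy as np

# Made-up symmetric system standing in for the Hessian and gradient.
hess = np.array([[4.0, 1.0],
                 [1.0, 3.0]])
grad = np.array([1.0, 2.0])

# What the original code does: build the inverse, then multiply.
step_inv = np.linalg.inv(hess).dot(grad)

# Suggested alternative: solve hess * step = grad directly.
step_solve = np.linalg.solve(hess, grad)
```

In the answer's loop this would correspond to something like theta += solve(hess, grad), with solve imported from numpy.linalg.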

@dwf: I was refactoring the original code without changing it. and that's what the original code does.

yairchu 2009-07-22 07:12:42

Answer 5

+3 A:

the matrixes kept rounding to integers until I initialized one value to 0.0. Is there a better way?

At the top of your code:

from __future__ import division

In Python 2.6 and earlier, dividing two integers always returns an integer; you only get a float if at least one operand is a floating point number. In Python 3.0 (and with the __future__ division import in 2.6), division works more how we humans might expect it to.

If you want integer division to return an integer, and you've imported division from __future__, use a double slash (//). That is:

from __future__ import division
print 1//2 # prints 0
print 5//2 # prints 2
print 1/2  # prints 0.5
print 5/2  # prints 2.5

Thomas Weigel 2009-06-17 19:19:45
