views:

1440

answers:

12

This is really the first thing that I have written in python. I come from Java background. I don't want to just learn how to program java code with Python syntax. I want to learn how to program in a pythonic paradigm.

Could you guys please comment on how I can make the following code more pythonic?

from math import sqrt

# recursively computes the factors of a number
def factors(num):
    factorList = []
    numroot = int(sqrt(num)) + 1
    numleft = num
    # brute force divide the number until you find a factor
    for i in range(2, numroot):
        if num % i == 0:
            # if we found a factor, add it to the list and compute the remainder
            factorList.append(i)
            numleft = num / i
            break
    # if we didn't find a factor, get out of here!
    if numleft == num: 
        factorList.append(num)
        return factorList
    # now recursively find the rest of the factors
    restFactors = factors(numleft)
    factorList.extend(restFactors)

    return factorList

# grabs  all of the twos in the list and puts them into 2 ^ x form
def transformFactorList(factorList):
    num2s = 0
    # remove all twos, counting them as we go
    while 2 in factorList:
        factorList.remove(2)
        num2s += 1
    # simply return the list with the 2's back in the right spot
    if num2s == 0: return factorList
    if num2s == 1:
        factorList.insert(0, 2)
        return factorList
    factorList.insert(0, '2 ^ ' + str(num2s))
    return factorList

print transformFactorList(factors(#some number))
+2  A: 

Here's what jumps out at me:

def transformFactorList(factorList):
    oldsize = len(factorList)
    factorList = [f for f in factorList if f != 2]
    num2s = oldsize - len(factorList)
    if num2s == 0:
        return []
    if num2s == 1:
        return [2]+factorList
     return ['2 ^ %s' % num2s] + [factorList]

The form [f for f in factorList if f != 2] is called a list-comprehension.

mbac32768
+8  A: 

A few comments:

  1. I would replace range() with xrange(); when you call range(), it allocates the entire range all at once, whereas when you iterate over xrange(), it returns each result one at a time, saving memory.
  2. Don't put expressions after conditionals on the same line (if num2s -- 0: return factorList). It makes it harder to see at a glance what it's doing (that it's a block).
  3. Don't be afraid to use modules. The [sympy][1] module already has code to compute factors, which may simplify your code by eliminating most of it.
  4. Python's string formatting is simple and effective.

For example:

factorList.insert(0, '2 ^ ' + str(num2s))

could be changed to

factorlist.insert(0, '2 ^ %s' % num2s)

All in all, I don't find your code to be extensively un-pythonic. Just make sure you want to use floor division, because that's what tends to happen by default with integer values. Otherwise, you'll need to fix up the division operator:

from __future__ import division

A sometimes-frustrating caveat of the language.

Dan Udey
+4  A: 
from itertools import takewhile

def transform_factor_list(factor_list):
    num_2s = len(list(takewhile(lambda e: e == 2, factor_list)))
    if num_2s > 1:
        factor_list[:num_2s] = ["2 ^ %i" % (num_2s, )]
    return factor_list

That's what I would make out of the second function.

Most pythonic changes:

  • PEP-8 compatible naming
  • slicing (and assigning to slices)
  • iterators
  • string formatting

The function assumes that the input is ordered, which is fulfilled by factors.

Edit: removed special cases for some lists, more compact this way

Torsten Marek
why would you use itertools.takewhile instead of a list comprehension with an "if" expression?
Claudiu
@Claudiu: takewhile stops early, a listcomp must run to the end. If you don't care about that, a simple `factor_list.count(2)` would do.
Beni Cherniavsky-Paskin
+17  A: 

Just use 'import math' and 'math.sqrt()' instead of 'from math import sqrt' and 'sqrt()'; you don't win anything by just importing 'sqrt', and code quickly gets unwieldy with too many from-imports. Also, things like reload() and mocking out for tests break a lot faster when you use from-import a lot.

The divmod() function is a convenient way to perform both division and modulo. You can use for/else instead of the separate check on numleft. Your factors function is a natural candidate for a generator. xrange() was already mentioned in another answer. Here's it all done that way:

import math

# recursively computes the factors of a number as a generator
def factors(num):
    numroot = int(math.sqrt(num)) + 1
    # brute force divide the number until you find a factor
    for i in xrange(2, numroot):
        divider, remainder = divmod(num, i)
        if not remainder:
            # if we found a factor, add it to the list and compute the
            # remainder
            yield i
            break
    else:
    # if we didn't find a factor, get out of here!
        yield num
        return
    # now recursively find the rest of the factors
    for factor in factors(divider):
        yield factor

Using a generator does mean you can only iterate over the result once; if you simply want a list (like you do in translateFactorsList) you will have to wrap the call to factors() in list().

Thomas Wouters
Whee! Generators! I was too slow... :-( Too bad I'm out of votes for today...
Torsten Marek
In very math intensive code, I tend to import functions as from math import cos, sin, etc... because otherwise, functions can obfuscated very fast by all the math. prefixes
Kena
+20  A: 

There is an excellent primer by David Goodger called "Code Like a Pythonista" here. A couple of things from that text re naming (quoting):

  • joined_lower for functions, methods, attributes

  • joined_lower or ALL_CAPS for constants

  • StudlyCaps for classes

  • camelCase only to conform to pre-existing conventions

bvmou
naming convention are important, but I think the poster asks more for Python coding style and habits
CharlesB
+15  A: 

One other thing you might want to look at is the docstring. For example, the comment for this function:

# recursively computes the factors of a number
def factors(num):

Could be converted into this:

def factors(num):
    """ recursively computes the factors of a number"""

It's not really 100% necessary to do it this way, but it's a good habit to get into in case you ever start using something along the lines of pydoc.

You can also do this:

docstring.py

"""This is a docstring"""

at the command line:

>>> import docstring
>>> help(docstring)

results:

Help on module docstring:

NAME
    docstring - This is a docstring

FILE
    /Users/jason/docstring.py
Jason Baker
maybe show an example of from example import factors then help(factors) with the two
bvmou
+3  A: 

Don't be afraid of list comprehensions. Switching from Java to Python and discovering them was a good day.

For the factors function, maybe something like this:

def factors(num):
    return [i for i in xrange(1, num+1) if num % i == 0]

Probably not the best code but it's short and easy to understand.

Good luck with Python, it's a great language.

Casey
Umm... Casey this was from September. :P
Evan Fosmark
Evan, it was on the little thing on the right "Related" :) thought I might say something hehehe
Casey
I appreciate the help. I still check on my old stuff.
jjnguy
Nice, but this finds divisors, not factors. You give composite divisors too, and you give repeated factors once: 12 = 4 * 3 * 3 but your factors(12) == [1, 2, 3, 4, 6, 12]
Beni Cherniavsky-Paskin
[Make that 12 == 2 * 2 * 3 of course]
Beni Cherniavsky-Paskin
+2  A: 

Since this post seems to be resurrected by Casey (lol), I'll add in my 2 cents.

Go over everything in PEP-8. It helped me out substantially when I had code formatting issues.

Evan Fosmark
+3  A: 

this is how I'd do this...

import itertools
import collections

def factorize(n):
 # ideally an iterator of prime numbers
 # this'll work though
 divisors = itertools.count(2)

 divisor = divisors.next()
 while True:
  if divisor**2 > n:
   yield n
   break

  a,b = divmod(n, divisor)

  if b == 0:
   yield divisor
   n = a
  else:
   divisor = divisors.next()

def compress(factors):
 summands = collections.defaultdict(lambda: 0)

 for factor in factors:
  summands[factor] += 1

 return [(base, summands[base]) for base in sorted(summands)]

def tostring(compressed):
 return ' * '.join("%d**%d" % factor for factor in compressed)
+4  A: 

Based on chris's answer, slightly simplified:

  • for instead of external while
  • internal while to preserve ability to use same divisor repeatedly
  • use itertools.groupby simplifies compress() by much
  • fix a small bug in tostring()

HTH:

import itertools

def factorize(n):
    # ideally an iterator of prime numbers
    # this'll work though
    divisors = itertools.count(2)

    for divisor in divisors:
        # This condition is very clever!
        # Note that `n` is decreasing, while `divisor` is increasing.
        # And we know that `n` is not divisible by anything smaller,
        # so this stops as soon as the remaining `n` is obviously prime.
        if divisor**2 > n:
            yield n
            break

        while n % divisor == 0:
            yield divisor
            n //= divisor

def compress(factors):
    for (factor, copies) in itertools.groupby(factors):
        # The second object yielded by groupby is a generator of equal factors.
        # Using list() to count its length.
        power = len(list(copies))
        yield (factor, power)

def tostring(compressed):
    return ' * '.join("%d**%d" % (factor, power) for (factor, power) in compressed)

# test
assert tostring(compress(factorize(12))) == '2**2 * 3**1'
Beni Cherniavsky-Paskin
+1  A: 

Using recursion (where not necessary) is not pythonic. Python doesn't have tail recursion elimination and flat is better than nested.

When in doubt, try import this

update: by popular request, here goes the iterative factorization (sigh):

"""returns an iterator of tuples (factor, power) such that 
reduce(operator.mul, (factor**power for factor, power in factors(n))) == n """
def factors(n):
    i = 2
    while n > 1:
        p = 0
        while n > 1 and n % i == 0:
            p += 1
            n /= i
        if p:
            yield (i, p)
        i += 1
fortran
could you suggest an alternative to recursion in this case?
littlegreen
are you joking? the "natural" way to express factorization is iterative, not recursive... just loop over primes (or all naturals, for it doesn't matter) and if the number is divisible... OK, I think it's faster to update the answer, but it's really lame.
fortran
+1  A: 

I'd use a list comprehension to get the twos out:

def transformFactorList(factorList):
    twos = [x for x in factorList if x == 2]
    rest = [x for x in factorList if x != 2]
    rest.insert(0, "2 ^ %d" % len(twos))
    return rest

Note that this will give you 2^0 and 2^1, which your code didn't. What you're doing with the twos seems arbitraty (sometimes you get a string, sometimes a number, sometimes nothing), so I figured that would be fine. You can change that easily if you want:

def transformFactorList(factorList):
    twos = [x for x in factorList if x == 2]
    rest = [x for x in factorList if x != 2]
    if twos:
        rest.insert(0, 2 if len(twos)==1 else "2 ^ %d" % len(twos))
    return rest
Claudiu